Easy Pandas Tutorial Cheat Sheet

This cheat sheet gives a quick overview of common Pandas operations for data manipulation, from basic to more advanced. Use these snippets to load, clean, and analyze your data.

1. Import Pandas

import pandas as pd

2. Create a DataFrame

data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
print(df)

3. Read and Write CSV Files

# Read CSV
df = pd.read_csv('file.csv')

# Write to CSV
df.to_csv('output.csv', index=False)

4. Inspect Data

print(df.head())   # First 5 rows
df.info()          # Data types and summary (prints directly and returns None)

5. Select Columns

age_column = df['Age']
subset = df[['Name', 'Age']]

6. Filter Rows

filtered = df[df['Age'] > 25]
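
Filters often need more than one condition. The sketch below shows one way to combine them; the data and the 'City' column are made up for illustration and are not part of the snippets above.

```python
import pandas as pd

# Illustrative data; 'City' is a hypothetical column
people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Cara'],
    'Age': [25, 30, 35],
    'City': ['Paris', 'Rome', 'Paris'],
})

# Combine conditions with & (and) or | (or); wrap each condition in parentheses
older_in_paris = people[(people['Age'] > 25) & (people['City'] == 'Paris')]

# isin() keeps rows whose value appears in a list
in_listed_cities = people[people['City'].isin(['Rome', 'Oslo'])]

print(older_in_paris)
print(in_listed_cities)
```

The parentheses matter because & and | bind more tightly than comparison operators in Python.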

7. Add and Remove Columns

# Add a column
df['Salary'] = [50000, 60000]

# Drop a column
df = df.drop(columns=['Salary'])

8. Sorting

# Sort by age
df_sorted = df.sort_values(by='Age', ascending=False)

9. GroupBy and Aggregation

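# Group by Name and calculate the mean Age
# (calling .mean() on the whole frame fails in recent Pandas if text columns are present)
grouped = df.groupby('Name')['Age'].mean()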
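
When several statistics are needed per group, named aggregation is one option. This is a minimal sketch; the 'Team' and 'Score' columns and the output column names are assumptions chosen for the example.

```python
import pandas as pd

# Illustrative data; 'Team' is a hypothetical grouping column
sales = pd.DataFrame({
    'Team': ['A', 'A', 'B'],
    'Score': [90, 70, 80],
})

# Named aggregation: new_column=(source_column, function)
summary = sales.groupby('Team').agg(
    mean_score=('Score', 'mean'),
    max_score=('Score', 'max'),
    n_rows=('Score', 'count'),
)
print(summary)
```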

10. Handle Missing Data

# Fill missing values
df['Age'] = df['Age'].fillna(0)

# Drop rows with missing values
df = df.dropna()
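
Filling with 0 can distort averages, so a column statistic is sometimes a better fill value. The sketch below, on made-up data, fills with the column mean and drops rows only when a specific column is missing.

```python
import numpy as np
import pandas as pd

# Illustrative data with gaps in both columns
ages = pd.DataFrame({
    'Name': ['Alice', 'Bob', None],
    'Age': [25.0, np.nan, 35.0],
})

# Fill missing ages with the mean of the known ages
filled = ages.assign(Age=ages['Age'].fillna(ages['Age'].mean()))

# Drop rows only where 'Name' is missing; other gaps are kept
named_only = ages.dropna(subset=['Name'])

print(filled)
print(named_only)
```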

11. Merge and Join

# Merge two DataFrames
df1 = pd.DataFrame({'ID': [1, 2], 'Name': ['Alice', 'Bob']})
df2 = pd.DataFrame({'ID': [1, 2], 'Score': [90, 80]})
merged = pd.merge(df1, df2, on='ID')
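
By default, pd.merge performs an inner join, so IDs missing from either side are dropped. The following sketch uses made-up frames with partially overlapping IDs to show how the `how` and `indicator` parameters change the result.

```python
import pandas as pd

# Illustrative frames; IDs 3 and 4 each appear on one side only
left = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Cara']})
right = pd.DataFrame({'ID': [1, 2, 4], 'Score': [90, 80, 70]})

inner = pd.merge(left, right, on='ID')                  # only IDs 1 and 2
left_join = pd.merge(left, right, on='ID', how='left')  # all left rows, NaN Score for ID 3
outer = pd.merge(left, right, on='ID', how='outer', indicator=True)  # all IDs, plus a _merge column

print(outer)
```

The `_merge` column reports whether each row came from both frames or from one side only, which helps when checking for unmatched keys.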

12. Concatenate DataFrames

# Stack rows (axis=0); columns missing from one DataFrame are filled with NaN
concat = pd.concat([df1, df2], axis=0)

13. Apply Functions

# Apply a custom function
df['AgeSquared'] = df['Age'].apply(lambda x: x ** 2)

14. Pivot Table

# Create a pivot table (uses the merged DataFrame from step 11, which has a 'Score' column)
pivot = merged.pivot_table(values='Score', index='Name', aggfunc='mean')

15. Iterating Over Rows

for index, row in df.iterrows():
    print(row['Name'], row['Age'])
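
iterrows() is convenient but slow on large frames. As a rough sketch on made-up data, itertuples() and plain column arithmetic are two common alternatives; the 'AgeNextYear' column is a hypothetical name.

```python
import pandas as pd

people = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# itertuples() yields lightweight namedtuples with attribute access
labels = [f"{row.Name} ({row.Age})" for row in people.itertuples(index=False)]

# Column arithmetic avoids an explicit Python loop altogether
people['AgeNextYear'] = people['Age'] + 1

print(labels)
print(people)
```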

16. Reset Index

df = df.reset_index(drop=True)

17. Set Index

df = df.set_index('Name')

18. String Operations

# Convert names to lowercase
df['Name'] = df['Name'].str.lower()
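
Other `.str` methods can be chained in the same way. A small sketch on made-up names:

```python
import pandas as pd

names = pd.Series(['  Alice ', 'BOB', 'cara'])

# Trim surrounding whitespace, then capitalize each word
cleaned = names.str.strip().str.title()

# Boolean mask: does the name contain the letter "a" (ignoring case)?
has_a = cleaned.str.contains('a', case=False)

print(cleaned.tolist())
print(has_a.tolist())
```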

19. Convert Data Types

# Convert Age to float
df['Age'] = df['Age'].astype(float)
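
astype() raises an error if any value cannot be converted. When a column may contain bad entries, pd.to_numeric with errors='coerce' is one way to turn them into NaN instead. The data below is made up for illustration.

```python
import pandas as pd

# 'unknown' cannot be parsed as a number
raw = pd.DataFrame({'Age': ['25', '30', 'unknown']})

# Unparseable values become NaN, so the column ends up as float
raw['Age'] = pd.to_numeric(raw['Age'], errors='coerce')

print(raw)
print(raw['Age'].dtype)
```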

20. Save to Excel

# Requires an Excel engine such as openpyxl to be installed
df.to_excel('output.xlsx', index=False)

21. Load Excel File

df = pd.read_excel('file.xlsx')

22. Handle Duplicate Rows

# Drop duplicates
df = df.drop_duplicates()
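
drop_duplicates() compares whole rows by default. The sketch below, on made-up data, checks duplicates on one column only and keeps the last occurrence.

```python
import pandas as pd

records = pd.DataFrame({
    'Name': ['Alice', 'Alice', 'Bob'],
    'Age': [25, 26, 30],
})

# Flag rows whose 'Name' has already appeared earlier
dup_mask = records.duplicated(subset=['Name'])

# Keep only the last row for each 'Name'
latest = records.drop_duplicates(subset=['Name'], keep='last')

print(dup_mask.tolist())
print(latest)
```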

23. Rename Columns

df = df.rename(columns={'Name': 'FullName'})

24. Check for Null Values

null_check = df.isnull().sum()

25. Get Column Statistics

mean_age = df['Age'].mean()
sum_age = df['Age'].sum()
