{"id":208,"date":"2026-03-03T13:31:41","date_gmt":"2026-03-03T08:31:41","guid":{"rendered":"https:\/\/gigz.pk\/python\/?post_type=lesson&#038;p=208"},"modified":"2026-03-22T19:18:35","modified_gmt":"2026-03-22T14:18:35","slug":"transforming-data-using-pandas","status":"publish","type":"lesson","link":"https:\/\/gigz.pk\/python\/lesson\/transforming-data-using-pandas\/","title":{"rendered":"Transforming Data Using Pandas"},"content":{"rendered":"\n<p>Data transformation is the process of cleaning, modifying, and preparing raw data for analysis.<\/p>\n\n\n\n<p><strong>Pandas<\/strong> is one of the most powerful Python libraries for data transformation and manipulation.<\/p>\n\n\n\n<p>It is widely used in:<\/p>\n\n\n\n<p>Data Engineering<br>Data Analysis<br>Machine Learning<br>ETL Pipelines<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">1. Loading Data<\/h1>\n\n\n\n<p>Before transforming, we load the data.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import pandas as pd<br><br>df = pd.read_csv(\"sales.csv\")<br>print(df.head())<\/pre>\n\n\n\n<p>Now the data is ready for transformation.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">2. Handling Missing Values<\/h1>\n\n\n\n<p>Count the missing values in each column:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">print(df.isnull().sum())<\/pre>\n\n\n\n<p>Drop rows that contain any missing value:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">df = df.dropna()<\/pre>\n\n\n\n<p>Or fill missing values in a column with a default:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">df[\"price\"] = df[\"price\"].fillna(0)<\/pre>\n\n\n\n<h1 class=\"wp-block-heading\">3. Removing Duplicates<\/h1>\n\n\n\n<pre class=\"wp-block-preformatted\">df = df.drop_duplicates()<\/pre>\n\n\n\n<p>This drops rows that exactly repeat an earlier row, so the same record is not counted twice.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">4. 
Filtering Data<\/h1>\n\n\n\n<p>Keep only the rows that match a condition:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">df = df[df[\"price\"] &gt; 100]<\/pre>\n\n\n\n<p>Combine multiple conditions with &amp; (and), wrapping each condition in parentheses:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">df = df[(df[\"price\"] &gt; 100) &amp; (df[\"quantity\"] &gt;= 2)]<\/pre>\n\n\n\n<h1 class=\"wp-block-heading\">5. Selecting Columns<\/h1>\n\n\n\n<pre class=\"wp-block-preformatted\">df = df[[\"product\", \"price\", \"quantity\"]]<\/pre>\n\n\n\n<p>Keeping only the columns you need reduces memory usage and can speed up later steps.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">6. Creating New Columns<\/h1>\n\n\n\n<pre class=\"wp-block-preformatted\">df[\"total\"] = df[\"price\"] * df[\"quantity\"]<\/pre>\n\n\n\n<p>This creates a calculated column.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">7. Changing Data Types<\/h1>\n\n\n\n<pre class=\"wp-block-preformatted\">df[\"quantity\"] = df[\"quantity\"].astype(\"int32\")<br>df[\"date\"] = pd.to_datetime(df[\"date\"])<\/pre>\n\n\n\n<p>Correct data types make calculations and date operations behave as expected, and smaller numeric types can reduce memory usage.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">8. Renaming Columns<\/h1>\n\n\n\n<pre class=\"wp-block-preformatted\">df = df.rename(columns={\"product_name\": \"product\"})<\/pre>\n\n\n\n<h1 class=\"wp-block-heading\">9. Grouping and Aggregation<\/h1>\n\n\n\n<p>Group data and calculate totals:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">summary = df.groupby(\"product\")[\"total\"].sum()<br>print(summary)<\/pre>\n\n\n\n<p>Multiple aggregations:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">summary = df.groupby(\"product\").agg({<br>    \"total\": \"sum\",<br>    \"quantity\": \"mean\"<br>})<\/pre>\n\n\n\n<h1 class=\"wp-block-heading\">10. Sorting Data<\/h1>\n\n\n\n<pre class=\"wp-block-preformatted\">df = df.sort_values(by=\"total\", ascending=False)<\/pre>\n\n\n\n<h1 class=\"wp-block-heading\">11. 
Merging DataFrames<\/h1>\n\n\n\n<p>Combine two datasets:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">df1 = pd.read_csv(\"customers.csv\")<br>df2 = pd.read_csv(\"orders.csv\")<br><br>merged = pd.merge(df1, df2, on=\"customer_id\", how=\"inner\")<\/pre>\n\n\n\n<p>Join types:<\/p>\n\n\n\n<p>inner: keep only keys found in both tables<br>left: keep every row from the first table<br>right: keep every row from the second table<br>outer: keep every row from both tables<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">12. Applying Custom Functions<\/h1>\n\n\n\n<pre class=\"wp-block-preformatted\">df[\"discounted_price\"] = df[\"price\"].apply(lambda x: x * 0.9)<\/pre>\n\n\n\n<p>Use apply to run custom logic on each value of a column. For simple arithmetic like this, the vectorized form df[\"price\"] * 0.9 gives the same result and is usually faster, so reserve apply for logic that cannot be vectorized.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">13. Pivot Tables<\/h1>\n\n\n\n<p>Create a summary table:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">pivot = df.pivot_table(<br>    values=\"total\",<br>    index=\"product\",<br>    columns=\"region\",<br>    aggfunc=\"sum\"<br>)<\/pre>\n\n\n\n<p>Useful for reporting and dashboards.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">14. Export Transformed Data<\/h1>\n\n\n\n<pre class=\"wp-block-preformatted\">df.to_csv(\"cleaned_sales.csv\", index=False)<\/pre>\n\n\n\n<p>Alternatively, save the result to a database for analytics.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Real-World ETL Example<\/h1>\n\n\n\n<p>Extract:<\/p>\n\n\n\n<p>Load raw sales data<\/p>\n\n\n\n<p>Transform:<\/p>\n\n\n\n<p>Remove duplicates<br>Handle missing values<br>Calculate total revenue<br>Aggregate by product<\/p>\n\n\n\n<p>Load:<\/p>\n\n\n\n<p>Save the cleaned data into a data warehouse<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Best Practices<\/h1>\n\n\n\n<p>Avoid loops, use vectorized operations<br>Handle missing values carefully<br>Use proper data types<br>Document transformations<br>Validate output data<br>Optimize memory usage<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Common Mistakes<\/h1>\n\n\n\n<p>Modifying original data without backup<br>Ignoring data types<br>Using loops instead of vectorization<br>Not checking missing values<br>Not validating final dataset<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Key 
Takeaway<\/h1>\n\n\n\n<p>Transforming data using Pandas involves cleaning, filtering, aggregating, and restructuring datasets to make them analysis-ready.<\/p>\n\n\n\n<p>Pandas provides powerful and efficient tools that are essential for ETL processes and modern data workflows.<\/p>\n","protected":false},"menu_order":123,"template":"","class_list":["post-208","lesson","type-lesson","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.5 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Transforming Data Using Pandas - One Language. Endless Possibilities<\/title>\n<meta name=\"description\" content=\"Learn data transformation in Python using Pandas\u2014clean, filter, aggregate, and restructure datasets for ETL and analytics efficiently.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/gigz.pk\/python\/lesson\/transforming-data-using-pandas\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Transforming Data Using Pandas - One Language. 
Endless Possibilities\" \/>\n<meta property=\"og:description\" content=\"Learn data transformation in Python using Pandas\u2014clean, filter, aggregate, and restructure datasets for ETL and analytics efficiently.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/gigz.pk\/python\/lesson\/transforming-data-using-pandas\/\" \/>\n<meta property=\"og:site_name\" content=\"One Language. Endless Possibilities\" \/>\n<meta property=\"article:modified_time\" content=\"2026-03-22T14:18:35+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":[\"WebPage\",\"FAQPage\"],\"@id\":\"https:\\\/\\\/gigz.pk\\\/python\\\/lesson\\\/transforming-data-using-pandas\\\/\",\"url\":\"https:\\\/\\\/gigz.pk\\\/python\\\/lesson\\\/transforming-data-using-pandas\\\/\",\"name\":\"Transforming Data Using Pandas - One Language. 
Endless Possibilities\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/gigz.pk\\\/python\\\/#website\"},\"datePublished\":\"2026-03-03T08:31:41+00:00\",\"dateModified\":\"2026-03-22T14:18:35+00:00\",\"description\":\"Learn data transformation in Python using Pandas\u2014clean, filter, aggregate, and restructure datasets for ETL and analytics efficiently.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/gigz.pk\\\/python\\\/lesson\\\/transforming-data-using-pandas\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/gigz.pk\\\/python\\\/lesson\\\/transforming-data-using-pandas\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/gigz.pk\\\/python\\\/lesson\\\/transforming-data-using-pandas\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/gigz.pk\\\/python\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"PYTHON FOR DATA ENGINEERING (PYDE) > ETL and Data Pipelines > Transforming Data Using Pandas\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/gigz.pk\\\/python\\\/#website\",\"url\":\"https:\\\/\\\/gigz.pk\\\/python\\\/\",\"name\":\"One Language. Endless Possibilities\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/gigz.pk\\\/python\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Transforming Data Using Pandas - One Language. 
Endless Possibilities","description":"Learn data transformation in Python using Pandas\u2014clean, filter, aggregate, and restructure datasets for ETL and analytics efficiently.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/gigz.pk\/python\/lesson\/transforming-data-using-pandas\/","og_locale":"en_US","og_type":"article","og_title":"Transforming Data Using Pandas - One Language. Endless Possibilities","og_description":"Learn data transformation in Python using Pandas\u2014clean, filter, aggregate, and restructure datasets for ETL and analytics efficiently.","og_url":"https:\/\/gigz.pk\/python\/lesson\/transforming-data-using-pandas\/","og_site_name":"One Language. Endless Possibilities","article_modified_time":"2026-03-22T14:18:35+00:00","twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":["WebPage","FAQPage"],"@id":"https:\/\/gigz.pk\/python\/lesson\/transforming-data-using-pandas\/","url":"https:\/\/gigz.pk\/python\/lesson\/transforming-data-using-pandas\/","name":"Transforming Data Using Pandas - One Language. 
Endless Possibilities","isPartOf":{"@id":"https:\/\/gigz.pk\/python\/#website"},"datePublished":"2026-03-03T08:31:41+00:00","dateModified":"2026-03-22T14:18:35+00:00","description":"Learn data transformation in Python using Pandas\u2014clean, filter, aggregate, and restructure datasets for ETL and analytics efficiently.","breadcrumb":{"@id":"https:\/\/gigz.pk\/python\/lesson\/transforming-data-using-pandas\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/gigz.pk\/python\/lesson\/transforming-data-using-pandas\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/gigz.pk\/python\/lesson\/transforming-data-using-pandas\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/gigz.pk\/python\/"},{"@type":"ListItem","position":2,"name":"PYTHON FOR DATA ENGINEERING (PYDE) > ETL and Data Pipelines > Transforming Data Using Pandas"}]},{"@type":"WebSite","@id":"https:\/\/gigz.pk\/python\/#website","url":"https:\/\/gigz.pk\/python\/","name":"One Language. Endless Possibilities","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/gigz.pk\/python\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/gigz.pk\/python\/wp-json\/wp\/v2\/lesson\/208","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/gigz.pk\/python\/wp-json\/wp\/v2\/lesson"}],"about":[{"href":"https:\/\/gigz.pk\/python\/wp-json\/wp\/v2\/types\/lesson"}],"wp:attachment":[{"href":"https:\/\/gigz.pk\/python\/wp-json\/wp\/v2\/media?parent=208"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}