{"id":57,"date":"2026-04-03T11:00:14","date_gmt":"2026-04-03T11:00:14","guid":{"rendered":"https:\/\/gigz.pk\/ml\/?post_type=lesson&#038;p=57"},"modified":"2026-04-07T05:58:53","modified_gmt":"2026-04-07T05:58:53","slug":"train-test-split","status":"publish","type":"lesson","link":"https:\/\/gigz.pk\/ml\/lesson\/train-test-split\/","title":{"rendered":"Train-Test Split"},"content":{"rendered":"\n<p>Train-Test Split is a fundamental step in Machine Learning used to evaluate how well a model performs on new, unseen data. It involves dividing the dataset into two separate parts: one for training the model and one for testing its performance.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Why Train-Test Split is Important<\/h2>\n\n\n\n<p>When building a Machine Learning model, it is important to check whether the model can generalize to new data. If a model is evaluated only on the data it was trained on, it may give overly optimistic results. Using a separate test set ensures that the model\u2019s performance reflects its real-world effectiveness.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How Train-Test Split Works<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Training Set:<\/strong> This portion of the data is used to train the model. The model learns patterns, relationships, and features from this data.<\/li>\n\n\n\n<li><strong>Test Set:<\/strong> This portion of the data is kept separate and used to evaluate the model\u2019s performance after training. It provides an unbiased assessment of how the model will perform on new data.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">Common Split Ratios<\/h2>\n\n\n\n<p>A common practice is to split the data into 70% for training and 30% for testing. Other ratios such as 80\/20 or 75\/25 are also used depending on the dataset size.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Using Python for Train-Test Split<\/h2>\n\n\n\n<p>In Python, the <code>train_test_split<\/code> function from the <code>scikit-learn<\/code> library is commonly used. It allows you to randomly divide the data into training and testing sets and ensures reproducibility with a random seed.<\/p>\n\n\n\n<p>Example:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">from sklearn.model_selection import train_test_splitX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>Train-Test Split is essential for validating Machine Learning models. By keeping training and testing data separate, you can accurately measure the model\u2019s performance and avoid overfitting. This ensures that the model can make reliable predictions on new, unseen data.<\/p>\n\n\n<div class=\"yoast-breadcrumbs\"><span><span><a href=\"https:\/\/gigz.pk\/ml\/\">Home<\/a><\/span> \u00bb <span class=\"breadcrumb_last\" aria-current=\"page\">Machine Learning Foundations > Data Preparation > Train-Test Split<\/span><\/span><\/div>\n\n\n<div class=\"schema-faq wp-block-yoast-faq-block\"><div class=\"schema-faq-section\" id=\"faq-question-1775541521064\"><strong class=\"schema-faq-question\"><\/strong> <p class=\"schema-faq-answer\"><\/p> <\/div> <\/div>\n\n\n\n<div class=\"schema-faq wp-block-yoast-faq-block\"><div class=\"schema-faq-section\" id=\"faq-question-1775541520801\"><strong class=\"schema-faq-question\"><\/strong> <p class=\"schema-faq-answer\"><\/p> <\/div> <\/div>\n","protected":false},"menu_order":14,"template":"","class_list":["post-57","lesson","type-lesson","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.6 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Train-Test Split - Machine Learning Mastery<\/title>\n<meta name=\"description\" content=\"Learn train test split in machine learning \u2014 divide datasets for training and testing to evaluate ML model performance accurately.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/gigz.pk\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Train-Test Split - Machine Learning Mastery\" \/>\n<meta property=\"og:description\" content=\"Learn train test split in machine learning \u2014 divide datasets for training and testing to evaluate ML model performance accurately.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/gigz.pk\/\" \/>\n<meta property=\"og:site_name\" content=\"Machine Learning Mastery\" \/>\n<meta property=\"article:modified_time\" content=\"2026-04-07T05:58:53+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":[\"WebPage\",\"FAQPage\"],\"@id\":\"https:\\\/\\\/gigz.pk\\\/ml\\\/lesson\\\/train-test-split\\\/\",\"url\":\"https:\\\/\\\/gigz.pk\\\/\",\"name\":\"Train-Test Split - Machine Learning Mastery\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/gigz.pk\\\/ml\\\/#website\"},\"datePublished\":\"2026-04-03T11:00:14+00:00\",\"dateModified\":\"2026-04-07T05:58:53+00:00\",\"description\":\"Learn train test split in machine learning \u2014 divide datasets for training and testing to evaluate ML model performance accurately.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/gigz.pk\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/gigz.pk\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/gigz.pk\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/gigz.pk\\\/ml\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Machine Learning Foundations > Data Preparation > Train-Test Split\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/gigz.pk\\\/ml\\\/#website\",\"url\":\"https:\\\/\\\/gigz.pk\\\/ml\\\/\",\"name\":\"Machine Learning Mastery\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/gigz.pk\\\/ml\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Train-Test Split - Machine Learning Mastery","description":"Learn train test split in machine learning \u2014 divide datasets for training and testing to evaluate ML model performance accurately.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/gigz.pk\/","og_locale":"en_US","og_type":"article","og_title":"Train-Test Split - Machine Learning Mastery","og_description":"Learn train test split in machine learning \u2014 divide datasets for training and testing to evaluate ML model performance accurately.","og_url":"https:\/\/gigz.pk\/","og_site_name":"Machine Learning Mastery","article_modified_time":"2026-04-07T05:58:53+00:00","twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":["WebPage","FAQPage"],"@id":"https:\/\/gigz.pk\/ml\/lesson\/train-test-split\/","url":"https:\/\/gigz.pk\/","name":"Train-Test Split - Machine Learning Mastery","isPartOf":{"@id":"https:\/\/gigz.pk\/ml\/#website"},"datePublished":"2026-04-03T11:00:14+00:00","dateModified":"2026-04-07T05:58:53+00:00","description":"Learn train test split in machine learning \u2014 divide datasets for training and testing to evaluate ML model performance accurately.","breadcrumb":{"@id":"https:\/\/gigz.pk\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/gigz.pk\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/gigz.pk\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/gigz.pk\/ml\/"},{"@type":"ListItem","position":2,"name":"Machine Learning Foundations > Data Preparation > Train-Test Split"}]},{"@type":"WebSite","@id":"https:\/\/gigz.pk\/ml\/#website","url":"https:\/\/gigz.pk\/ml\/","name":"Machine Learning Mastery","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/gigz.pk\/ml\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/gigz.pk\/ml\/wp-json\/wp\/v2\/lesson\/57","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/gigz.pk\/ml\/wp-json\/wp\/v2\/lesson"}],"about":[{"href":"https:\/\/gigz.pk\/ml\/wp-json\/wp\/v2\/types\/lesson"}],"wp:attachment":[{"href":"https:\/\/gigz.pk\/ml\/wp-json\/wp\/v2\/media?parent=57"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}