I Tested AI Models Rewriting Python Code in 2026

I've been noodling on this whole AI code rewriting thing, especially after stumbling across that YouTube video, "The 80% Shift: Is AI Code Rewriting the Future of Human Labor?". The title is ridiculously clickbaity, but it snagged my attention. Could an AI really take an inefficient legacy Python script and make it sparkle? Not just patch a typo, mind you, but refactor it for performance, readability, and that modern Pythonic flair? I had to find out.

So, what was the actual mission? To put a few AI coding assistants through a gritty, real-world test. I wasn't asking them to whip up a new feature from thin air. The idea was for them to wrestle with something already there, warts and all, and improve it significantly. To me, this is a pivotal area for AI assistance: if these models can actually grasp and transform existing logic, their usefulness goes way up.

Setting Up The Test: That Absolutely Messy Python Script I Cooked Up

That's why I needed a chunk of code that was messy but still self-contained, riddled with inefficiencies and obvious areas for improvement. I cooked up a Python script that processes a slightly malformed CSV file. It reads the data, filters it on a numeric column (watch out for those string values!), sums the values that pass, and then, bless its heart, writes a summary to another CSV. It was intentionally written in a verbose, procedural style: the kind of code you stumble upon in really old codebases or from a developer just starting their Python journey.

So, what did I use as the baseline, you ask? This:


```python
def process_data_legacy(filepath, value_column, threshold):
    import csv
    data = []
    try:
        with open(filepath, 'r') as f:
            reader = csv.reader(f)
            headers = next(reader)
            for row_num, row in enumerate(reader):
                try:
                    row_dict = dict(zip(headers, row))
                    data.append(row_dict)
                except ValueError as e:
                    print(f"Skipping row {row_num + 2} due to malformed data: {e}")
                    continue
    except FileNotFoundError:
        print(f"Error: File not found at {filepath}")
        return

    filtered_data = []
    for item in data:
        try:
            if float(item.get(value_column, 0)) > threshold:
                filtered_data.append(item)
        except ValueError:
            print(f"Skipping item with non numeric value in {value_column}: {item.get(value_column)}")
            continue

    total_sum = 0
    for item in filtered_data:
        try:
            total_sum += float(item.get(value_column, 0))
        except ValueError:
            # Already handled in filtering, but defensive coding
            pass

    output_filename = "processed_summary.csv"
    with open(output_filename, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['Metric', 'Value'])
        writer.writerow(['Input Rows', len(data)])
        writer.writerow(['Filtered Rows', len(filtered_data)])
        writer.writerow(['Total Sum of Filtered Values', total_sum])
    print(f"Summary written to {output_filename}")


# Example usage with a dummy CSV (not fed to AI, just for context)
# with open('sample_data.csv', 'w', newline='') as f:
#     writer = csv.writer(f)
#     writer.writerow(['id', 'name', 'value'])
#     writer.writerow(['1', 'Alpha', '50.5'])
#     writer.writerow(['2', 'Beta', '120.3'])
#     writer.writerow(['3', 'Gamma', 'invalid'])
#     writer.writerow(['4', 'Delta', '80.0'])
# process_data_legacy('sample_data.csv', 'value', 100)
```
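To compare the rewrites fairly, it helps to pin down what the baseline should report for that sample file. Here's a minimal sketch of such a check. The names `SAMPLE_CSV` and `expected_summary` are my own inventions for illustration, not something any of the tools produced, and the helper only mirrors the legacy counting rules for a well-formed file like this one.

```python
import csv
import io

# Same rows as the commented-out sample above (hypothetical inline copy).
SAMPLE_CSV = """id,name,value
1,Alpha,50.5
2,Beta,120.3
3,Gamma,invalid
4,Delta,80.0
"""


def expected_summary(csv_text, value_column, threshold):
    """Return (input_rows, filtered_rows, total_sum) the way the legacy script counts them."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    kept = []
    for row in rows:
        try:
            value = float(row.get(value_column, 0))
        except ValueError:
            continue  # non-numeric values are skipped, but the row still counts as input
        if value > threshold:
            kept.append(value)
    return len(rows), len(kept), sum(kept)


print(expected_summary(SAMPLE_CSV, "value", 100))  # (4, 1, 120.3)
```

With a threshold of 100, only the Beta row survives, so any faithful rewrite should report 4 input rows, 1 filtered row, and a sum of 120.3.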

And my testing setup? Pretty standard, but worth noting:

Cursor Editor 0.28.0 (running on a MacBook Pro M2 Max, Sonoma 14.4.1), using its built-in Claude Opus 4.7 integration.
GitHub Copilot via VS Code (version 1.88.0) and Copilot Chat extension (version 0.16.2024041701).
DeepSeek Coder (specifically the 7B instruct model, `deepseek-coderv2:7b-instruct`) running locally via Ollama 0.1.33.

Ready for the AI Code Rewriting Gauntlet?

1. First Up: Cursor Editor with Claude Opus 4.7

First up, Cursor. Its entire shtick, after all, is AI-first coding. I simply pasted my `process_data_legacy` function into a fresh Python file and fired up the Cursor AI chat pane. My prompt? Utterly direct, no messing around:

Refactor this Python function for improved readability, efficiency, and modern Python practices. Consider using libraries like pandas if appropriate. Ensure solid error handling.

So, what happened? Cursor went straight to work and jumped right to `pandas`. I was frankly impressed by how quickly it recognized that the problem was a natural fit for a data processing library. It delivered a crisp, well-structured `pandas`-based solution.


```python
import pandas as pd


def process_data_modern(filepath, value_column, threshold):
    try:
        df = pd.read_csv(filepath)
    except FileNotFoundError:
        print(f"Error: File not found at {filepath}")
        return
    except Exception as e:
        print(f"Error reading CSV: {e}")
        return

    initial_rows = len(df)

    # Convert value_column to numeric, coercing errors to NaN
    df[value_column] = pd.to_numeric(df[value_column], errors='coerce')

    # Filter out rows where value_column is NaN or below threshold
    filtered_df = df[df[value_column].notna() & (df[value_column] > threshold)]
    filtered_rows = len(filtered_df)
    total_sum = filtered_df[value_column].sum()

    output_filename = "processed_summary_modern.csv"
    summary_data = {
        'Metric': ['Input Rows', 'Filtered Rows', 'Total Sum of Filtered Values'],
        'Value': [initial_rows, filtered_rows, total_sum]
    }
    summary_df = pd.DataFrame(summary_data)

    try:
        summary_df.to_csv(output_filename, index=False)
        print(f"Summary written to {output_filename}")
    except Exception as e:
        print(f"Error writing summary CSV: {e}")
```

My honest take? This was excellent. It fully embraced `pandas`, which, let's be real, is exactly what I would have done. The error handling improved noticeably, and the code became far more concise. It even threw in a clever touch with `errors='coerce'`, which turns non-numeric values into NaN during conversion so the filter can drop them cleanly. Writing the summary CSV was also noticeably cleaner.
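If the `errors='coerce'` trick feels like magic, here's a rough standard-library analogue of the idea. It's only a sketch: `coerce_to_float` is a hypothetical helper of mine, and `pandas` handles more input types and edge cases than this does.

```python
import math


def coerce_to_float(raw):
    """Return float(raw), or NaN when the text is not a number."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return math.nan


values = [coerce_to_float(v) for v in ["50.5", "120.3", "invalid", "80.0"]]

# NaN compares False against everything, so bad values drop out of the filter.
above = [v for v in values if v > 100]
print(above)  # [120.3]
```

The point is that the bad value never raises; it just becomes a NaN that quietly fails the threshold comparison.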

2. Next Contender: GitHub Copilot (VS Code Edition)

Alright, onto GitHub Copilot. I cracked open the same legacy function in a fresh VS Code window. My strategy? A smidge different this time. I commented out the initial function and began meticulously typing `def process_data_refactored(...)`. Also, I tapped into Copilot Chat for a second opinion.

So, about the inline stuff? As I typed, Copilot dutifully offered suggestions, but they felt distinctly piecemeal. It'd toss in a list comprehension here, or maybe a `try/except` block there. Crucially, though, it didn't propose a seismic shift to `pandas` all on its lonesome.

Then, Copilot Chat. I simply copied the original function straight into the chat interface and handed it a pretty similar prompt:

Refactor this Python function for better readability and efficiency. Focus on modern Pythonic idioms.

The results from Chat? Decent. It coughed up a version using list comprehensions and f-strings, which was a definite leap forward from the initial mess. It stuck with the standard library `csv` module, a perfectly acceptable choice, mind you, if `pandas` isn't a desired dependency. This approach made the code a heck of a lot denser and considerably more expressive.


```python
import csv


def process_data_refactored_copilot(filepath, value_column, threshold):
    try:
        with open(filepath, 'r', newline='') as f:
            reader = csv.DictReader(f)
            # Efficiently read data using DictReader and handle conversion errors
            data = []
            for row_num, row in enumerate(reader):
                try:
                    # Attempt to convert the value_column to float, default to 0 on error
                    row[value_column] = float(row.get(value_column, 0))
                    data.append(row)
                except ValueError:
                    print(f"Skipping row {row_num + 2} due to non numeric value in '{value_column}': {row.get(value_column)}")
                    continue
    except FileNotFoundError:
        print(f"Error: File not found at {filepath}")
        return
    except Exception as e:
        print(f"Error reading CSV: {e}")
        return

    # Filter data using a list comprehension
    filtered_data = [item for item in data if item.get(value_column, 0) > threshold]

    # Calculate total sum of filtered values
    total_sum = sum(item[value_column] for item in filtered_data)

    output_filename = "processed_summary_copilot.csv"
    summary_metrics = [
        {'Metric': 'Input Rows', 'Value': len(data)},
        {'Metric': 'Filtered Rows', 'Value': len(filtered_data)},
        {'Metric': 'Total Sum of Filtered Values', 'Value': total_sum}
    ]

    try:
        with open(output_filename, 'w', newline='') as f:
            writer = csv.DictWriter(f, fieldnames=['Metric', 'Value'])
            writer.writeheader()
            writer.writerows(summary_metrics)
        print(f"Summary written to {output_filename}")
    except Exception as e:
        print(f"Error writing summary CSV: {e}")
```

My take? Good, for sure, but nowhere near the seismic shift Cursor delivered. It did use `csv.DictReader`, which is a real improvement, and the list comprehension for filtering plus `sum()` for aggregation were the idiomatic choices. It also streamlined the output writing with `csv.DictWriter`. It stayed within the standard library, which is a hard constraint for some projects. For sheer concision and readability in data processing, `pandas` is the champion, but it's a heavy dependency that not every project can take on. This divergence highlights the different philosophies these tools embrace. Copilot, it seems, is a tad conservative, aiming to improve existing code without introducing major new dependencies. Microsoft's Copilot products in broader contexts often seem to aim for minimal disruption too.
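One thing worth checking in a rewrite like this is whether the summary numbers still mean the same thing. In the Copilot version, non-numeric rows are dropped while reading, so `len(data)` no longer counts every input row the way the legacy script did. Here's a minimal sketch of the two counting styles; `SAMPLE`, `count_all_rows`, and `count_numeric_rows` are hypothetical names I made up, and the data mirrors my dummy CSV.

```python
import csv
import io

SAMPLE = "id,name,value\n1,Alpha,50.5\n2,Beta,120.3\n3,Gamma,invalid\n4,Delta,80.0\n"


def count_all_rows(text):
    """Legacy style: every data row counts as input, valid or not."""
    return len(list(csv.DictReader(io.StringIO(text))))


def count_numeric_rows(text, value_column):
    """Convert-while-reading style: rows that fail float() never reach the list."""
    data = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            row[value_column] = float(row.get(value_column, 0))
        except ValueError:
            continue
        data.append(row)
    return len(data)


print(count_all_rows(SAMPLE), count_numeric_rows(SAMPLE, "value"))  # 4 3
```

Neither count is wrong on its own, but the "Input Rows" metric quietly changes meaning, which is exactly the kind of thing a quick diff of the summaries would catch.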

3. Lastly: DeepSeek Coder (Running Locally via Ollama, Believe It or Not)

Okay, time for the plucky local contender: DeepSeek Coder 7B instruct, running locally via Ollama. This setup is fascinating because it's chugging along totally off-grid on my MacBook Pro M2 Max. Sure, latency is a smidge higher than typical cloud APIs, but privacy concerns? Zero, since nothing leaves the machine. And cost? Also zero, once the model's downloaded, obviously.


```shell
ollama run deepseek-coder:instruct
```

So, I chucked the exact same prompt and original function into the terminal. It took, oh, about 15-20 seconds to spit out a response on my machine. The result? A perfectly plain old markdown-formatted code block, nothing fancy.
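I pasted into the terminal by hand, but if you'd rather script that step for repeatable runs, something like the following should work against Ollama's local HTTP API. Treat it as a sketch under assumptions: it assumes the server is listening on the default port 11434, the helper names are mine, and the model tag is just the one from my command above.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(model, prompt):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(model, prompt, timeout=120):
    """Send the prompt and return the generated text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Building the request does not touch the network; only ask_local_model() does.
req = build_request("deepseek-coder:instruct", "Refactor this Python function ...")
print(req.get_method(), req.full_url)
```

Setting `"stream": False` asks for a single JSON reply instead of a stream of chunks, which keeps the parsing to one `json.loads` call.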

DeepSeek Coder, interestingly, also stuck with the standard library but commendably introduced helper functions and crystal-clear variable names. It didn't even hint at `pandas`, which, to be fair, makes sense for a smaller, more constrained model; perhaps it was trained on a dataset that, shall we say, obsessed over standard library Python.


```python
import csv
from typing import List, Dict, Union


def _read_csv_data(filepath: str) -> List[Dict[str, str]]:
    """Reads CSV data into a list of dictionaries."""
    data = []
    try:
        with open(filepath, 'r', newline='') as f:
            reader = csv.DictReader(f)
            for row_num, row in enumerate(reader):
                data.append(row)
    except FileNotFoundError:
        print(f"Error: File not found at {filepath}")
    except Exception as e:
        print(f"Error reading CSV: {e}")
    return data


def _filter_and_process_rows(data: List[Dict[str, str]], value_column: str, threshold: float) -> tuple[List[Dict[str, Union[str, float]]], float]:
    """Filters rows and calculates the sum of a specific column."""
    filtered_data = []
    total_sum = 0.0
    for row in data:
        try:
            value = float(row.get(value_column, '0'))  # Default to '0' for safe float conversion
            if value > threshold:
                row[value_column] = value  # Update value to float for consistency
                filtered_data.append(row)
                total_sum += value
        except ValueError:
            print(f"Warning: Could not convert value '{row.get(value_column)}' to float in column '{value_column}'. Skipping row.")
            continue
    return filtered_data, total_sum


def _write_summary_csv(output_filename: str, initial_rows: int, filtered_rows: int, total_sum: float):
    """Writes summary metrics to a CSV file."""
    try:
        with open(output_filename, 'w', newline='') as f:
            writer = csv.writer(f)
            writer.writerow(['Metric', 'Value'])
            writer.writerow(['Input Rows', initial_rows])
            writer.writerow(['Filtered Rows', filtered_rows])
            writer.writerow(['Total Sum of Filtered Values', total_sum])
        print(f"Summary written to {output_filename}")
    except Exception as e:
        print(f"Error writing summary CSV: {e}")


def process_data_deepseek(filepath: str, value_column: str, threshold: float):
    """Main function to process CSV data, filter, and summarize."""
    data = _read_csv_data(filepath)
    if not data:
        return
    initial_rows = len(data)
    filtered_data, total_sum = _filter_and_process_rows(data, value_column, threshold)
    filtered_rows = len(filtered_data)
    output_filename = "processed_summary_deepseek.csv"
    _write_summary_csv(output_filename, initial_rows, filtered_rows, total_sum)
```

And my thoughts? DeepSeek Coder broke the problem down into smaller, manageable functions, which is great for readability and long-term maintenance. It also added type hints, something I always appreciate. The error handling was, well, tolerable; it prints warnings instead of raising exceptions, which is a gentle design choice, though it means a caller can't tell a missing file from an empty one. Still, it was a solid refactor within the constraint of not introducing any new libraries. It felt like a surprisingly thoughtful, well-structured approach, even if it lacked the radical dependency shift Cursor went for.
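For contrast, here's roughly what the "raise instead of print" alternative could look like for the read step. This is my own sketch, not anything DeepSeek Coder produced: `SummaryError` and `read_rows_strict` are hypothetical names.

```python
import csv


class SummaryError(Exception):
    """Hypothetical error type for failures the caller should handle."""


def read_rows_strict(filepath):
    """Read CSV rows, raising instead of printing and returning an empty list."""
    try:
        with open(filepath, "r", newline="") as f:
            return list(csv.DictReader(f))
    except FileNotFoundError as exc:
        raise SummaryError(f"File not found at {filepath}") from exc


try:
    read_rows_strict("does_not_exist.csv")
except SummaryError as err:
    print(f"Caller decides what to do: {err}")
```

The trade-off is that callers now have to handle the exception, but in exchange a missing file can no longer be mistaken for an empty one.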