AI Code Generation Models: The Big List

Here is AI Business' big list of AI code generation models.

This article was originally published on AI Business.

ChatGPT wowed the public with its ability to generate text, with marketers and copywriters able to use it to aid their work. Emerging around the same time is another use for generative AI that could revolutionize software development: text-to-code.

Instead of painstakingly writing code line-by-line, developers may soon be able to simply describe what they want the program to do in natural language. AI systems like ChatGPT, Copilot and Ghostwriter will then take care of producing the required code.

AI Business explores the workings and abilities of these transformative AI systems that can code entire programs from the ground up based solely on text prompts.

Text to Code

What are text-to-code AI models?

Text-to-code AI models use machine learning to generate snippets of code or entire functions. These models are trained on vast amounts of public code and are designed to aid human developers.

Text-to-code AI models take natural language inputs, such as plain English, and turn them into code.
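As a rough illustration, a text-to-code exchange might look like the sketch below. It was hand-written for this list and is not the output of any model covered here; the prompt wording and the function name `is_palindrome` are assumptions made for the example.

```python
# Natural-language prompt a developer might type (hypothetical wording):
PROMPT = (
    "Write a function that checks whether a string is a palindrome, "
    "ignoring case and punctuation."
)


# The kind of code a text-to-code model could plausibly return.
# Hand-written illustration, not actual model output.
def is_palindrome(text: str) -> bool:
    # Keep only letters and digits, lowercased, so that
    # "A man, a plan, a canal: Panama" counts as a palindrome.
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


if __name__ == "__main__":
    print(is_palindrome("A man, a plan, a canal: Panama"))
    print(is_palindrome("text-to-code"))
```

In practice, generated code of this kind still needs to be reviewed and tested by the developer before it is used.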

Examples of Text-to-Code AI Models and Applications

StarCoder

Creators:

  • ServiceNow - Santa Clara-based enterprise workflows company

  • Hugging Face - Machine learning tools developer and home to one of the internet’s largest libraries of natural language processing AI models

First published: May 2023

StarCoder is a 15 billion-parameter AI model designed to generate code for the open-scientific AI research community.

StarCoder was trained on permissively licensed data from GitHub spanning over 80 programming languages, and then fine-tuned on 35 billion Python tokens.

The resulting model outperforms Google’s PaLM 1 and Meta’s LLaMA on popular benchmarks despite its small size.

Access StarCoder: https://huggingface.co/bigcode/starcoder

Read more about StarCoder on AI Business: https://aibusiness.com/nlp/hugging-face-service-now-launch-coding-llm-starcoder

Codex

Creator: OpenAI - San Francisco-based AI research lab backed by Microsoft

First published: August 2021

Codex is a code generation model that powers GitHub Copilot (see below).

Proficient in more than a dozen programming languages, Codex can interpret simple commands in natural language and execute them.

More on OpenAI’s code generation capabilities: https://platform.openai.com/docs/guides/code

Read more about Codex on AI Business: https://aibusiness.com/ml/openai-upgrades-codex-machine-learning-assistant-says-it-can-turn-natural-language-into-code

Copilot

Creators:

  • GitHub - Microsoft-owned code hosting platform

  • Microsoft - Software giant behind Windows and Microsoft 365

  • OpenAI - San Francisco-based AI research lab backed by Microsoft

First published: October 2021

Current version: Copilot X

Copilot is a generative AI coding tool. It can take text inputs and queries and turn them into coding suggestions across dozens of languages, including Python, JavaScript, TypeScript, Ruby and Go.

The latest iteration, unveiled in March 2023, supports voice recognition, GPT-4-powered tags in pull request descriptions and a ChatGPT-style interface where devs can ask questions about business documentation.

Read more about Copilot X on AI Business: https://aibusiness.com/verticals/github-supercharges-copilot-with-gpt-4-new-features

Code Interpreter

Creator: OpenAI - San Francisco-based AI research lab backed by Microsoft

First published: July 2023

The only plugin on this list, Code Interpreter is an add-on for ChatGPT that lets users generate and execute code. Without it, ChatGPT can only generate code snippets.

Code Interpreter is only available to subscribers of OpenAI’s premium offering, ChatGPT Plus. Users can write and execute Python code as well as upload a file and ask ChatGPT to analyze data, create charts, edit files and perform math.
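To give a sense of the data-analysis use case, here is a minimal sketch of the kind of Python script such a tool might write and run after a user uploads a spreadsheet and asks for a summary. This is an illustration only, not Code Interpreter's actual output or internals; the CSV contents, column names and the `summarize_revenue` function are all made up for the example.

```python
import csv
import io
import statistics

# Stand-in for an uploaded file; the columns and values are invented.
UPLOADED_CSV = """month,revenue
Jan,1200
Feb,1500
Mar,900
Apr,1800
"""


def summarize_revenue(csv_text: str) -> dict:
    """Return basic statistics for the 'revenue' column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    revenues = [float(row["revenue"]) for row in rows]
    # Find the row with the highest revenue to report the best month.
    best = max(rows, key=lambda row: float(row["revenue"]))
    return {
        "months": len(rows),
        "total": sum(revenues),
        "mean": statistics.mean(revenues),
        "best_month": best["month"],
    }


if __name__ == "__main__":
    print(summarize_revenue(UPLOADED_CSV))
```

The sketch sticks to Python's standard library so it runs anywhere; a real session could just as easily reach for charting or data-frame libraries.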

Read more about Code Interpreter on AI Business: https://aibusiness.com/nlp/openai-s-code-interpreter-lets-chatgpt-play-data-scientist

CodeT5

Creator: Salesforce - Enterprise cloud giant

First published: May 2023 (CodeT5+, the version covered by the paper linked below; the original CodeT5 dates to 2021)

CodeT5 is a large language model for code understanding and generation tasks.

It is a pre-trained encoder-decoder model built for understanding tasks such as code defect detection and clone detection, as well as for code generation.

Read the CodeT5+ research paper: https://arxiv.org/pdf/2305.07922.pdf

Access the CodeT5 code: https://github.com/salesforce/CodeT5

Polycoder

Creators: Researchers from Carnegie Mellon University - the full list of authors is in the paper.

First published: May 2022

Based on OpenAI’s GPT-2 architecture, Polycoder was trained on a dataset of 249GB of code spanning 12 programming languages.

It was designed as an open source alternative to OpenAI’s Codex (see above). While not as powerful as some of the code generation models on this list, Polycoder surpasses Codex at writing code in the programming language C.

Read the Polycoder paper: https://arxiv.org/pdf/2202.13169.pdf

Access the Polycoder code: https://github.com/vhellendoorn/code-lms#models

Replit Ghostwriter 

Creator: Replit - San Francisco-based startup and software development platform

First published: October 2022

Replit’s answer to GitHub Copilot (see above), Ghostwriter is an AI-powered programming tool to aid developers building software.

Ghostwriter can complete code, offer suggestions and explain code to the user in plain English. It can also rewrite and generate code based on natural language prompts.

In February 2023, Replit launched Ghostwriter Chat, adding conversational AI capabilities to the mix.

Try Replit Ghostwriter: https://replit.com/signup

Tabnine

Tabnine uses deep learning to aid code completions. Tabnine supports over 50 programming languages and has a free one-user option.

For businesses, Tabnine has an enterprise offering, with the chipmaking giant Nvidia, Elon Musk’s rocket company SpaceX and sportswear maker Nike all using the platform.

Try Tabnine: https://www.tabnine.com/install

About the Authors

Ben Wodecki

Assistant Editor, AI Business

Ben Wodecki is assistant editor at AI Business, a publication dedicated to the latest trends in artificial intelligence.

AI Business

AI Business, an ITPro Today sister site, is the leading content portal for artificial intelligence and its real-world applications. With its exclusive access to the global c-suite and the trendsetters of the technology world, it brings readers up-to-the-minute insights into how AI technologies are transforming the global economy - and societies - today.
