
Hello, this is Sasaki, an AI engineer.

Will the era of programming in natural language arrive?
OpenAI is building an AI model that hints at such a future.

Last year, Codex, an engine that automatically generates code, was added as a variant of GPT-3, a large-scale natural language processing model.
At the time of writing, it is still in private beta. Beta testers around the world are trying it out, and the model continues to be improved.

In this article, we introduce videos of code generation by Codex, a programming tool of the future.
The programming languages covered in the videos are JavaScript and Python.
The commentary is provided as audio, so please watch with the sound turned on.

JavaScript

We generate code with the Codex JavaScript Sandbox.
Enter an instruction in natural language, and the code is generated and executed.


hello

Display "Hello" using a natural-language instruction, then try changing the font size, color, position, and language.

ball that changes color

Create a ball that bounces in a window and changes color with each bounce.

Creating the top page of a website

Give the same instructions in Japanese and English and compare the results.

One Sentence to Code

  • Button to roll the dice
  • Wall clock
  • How the ball hits the ground
  • The sun and the moon
  • Block breaker game
  • Make a snowstorm on a black background
  • Foam art

Python

For Python code generation, we install the Codex plug-in for JupyterLab and experiment with it.

$ pip install jupyterlab-codex

This Plug-In allows you to seamlessly call the Codex API from your Jupyter Notebook.


Basic operation ①

  • Displaying "Hello World!"
  • Package imports (NumPy, pandas, Matplotlib)
  • Sum of the integers from 1 to n
  • Generate and display Fibonacci numbers (0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ...)
  • Volume of a sphere
  • Generate a null vector of size 10 in which only the fifth value is 1
  • Generate N random numbers and make a histogram
  • Graphing the highest temperature in August 2021 in Yokohama City
  • Calculations with a large number of variables

Hover over the cell containing the instruction and click the Codex Plug-In button at the top to generate code with Codex.
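
For reference, several of the instructions in the list above correspond to only a few lines of Python. The following is a hand-written sketch of what reasonable answers could look like, not the code Codex generated in the video. The function names are chosen here purely for illustration.

```python
import math


def sum_to_n(n):
    """Sum of the integers from 1 to n (closed-form formula)."""
    return n * (n + 1) // 2


def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting from 0."""
    numbers = []
    a, b = 0, 1
    for _ in range(count):
        numbers.append(a)
        a, b = b, a + b
    return numbers


def sphere_volume(radius):
    """Volume of a sphere: 4/3 * pi * r^3."""
    return 4.0 / 3.0 * math.pi * radius ** 3


def null_vector_with_fifth_one(size=10):
    """Vector of zeros of the given size whose fifth value is 1."""
    vector = [0] * size
    vector[4] = 1  # the fifth element has index 4
    return vector


if __name__ == "__main__":
    print("Hello World!")
    print(sum_to_n(10))
    print(fibonacci(11))
    print(sphere_volume(1.0))
    print(null_vector_with_fifth_one())
```

Each instruction maps to one small function, which is roughly the granularity at which a notebook cell is filled in.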

Basic operation ②

  • String manipulation (remove 'e', lower case, vowel manipulation)
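
As a rough idea of what these string tasks involve, here is a minimal hand-written sketch (again, not Codex output). "Vowel manipulation" is interpreted here as counting and removing vowels, which is an assumption; the helper names are hypothetical.

```python
VOWELS = set("aeiouAEIOU")


def remove_letter(text, letter="e"):
    """Remove every occurrence of `letter` from `text`."""
    return text.replace(letter, "")


def to_lower(text):
    """Convert `text` to lower case."""
    return text.lower()


def count_vowels(text):
    """Count the vowels in `text` (assumed meaning of 'vowel manipulation')."""
    return sum(1 for ch in text if ch in VOWELS)


def strip_vowels(text):
    """Return `text` with all vowels removed."""
    return "".join(ch for ch in text if ch not in VOWELS)


if __name__ == "__main__":
    sample = "Hello there"
    print(remove_letter(sample))
    print(to_lower(sample))
    print(count_vowels(sample))
    print(strip_vowels(sample))
```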

numpy-100 challenge

Finally, we take on numpy-100, a set of 100 exercises for NumPy, Python's numerical computing library.
It is published on GitHub, so it may well have been part of the training data for Codex.

numpy-100
https://github.com/rougier/numpy-100

The exercises are divided into three difficulty levels: 35 questions at ★☆☆, 29 at ★★☆, and 36 at ★★★.
Here are a few of them.

Examples:
9. Create a 3x3 matrix with values ranging from 0 to 8 (★☆☆)
30. How to find common values between two arrays? (★☆☆)
43. Make an array immutable (read-only) (★★☆)
68. Considering a one-dimensional vector D, how to compute means of subsets of D using a vector S of same size describing subset indices?
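
To give a feel for the expected answers, here is one possible NumPy solution for each of the four examples. This is a hand-written sketch, not the code Codex produced, and the sample arrays are made up for illustration. Other correct solutions exist.

```python
import numpy as np

# 9. Create a 3x3 matrix with values ranging from 0 to 8
matrix = np.arange(9).reshape(3, 3)

# 30. Find common values between two arrays (sample data is illustrative)
first = np.array([1, 2, 3, 4, 5])
second = np.array([4, 5, 6, 7])
common = np.intersect1d(first, second)

# 43. Make an array immutable (read-only)
frozen = np.zeros(10)
frozen.flags.writeable = False  # later assignments raise ValueError

# 68. Means of subsets of D, where S holds the subset index of each element
D = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
S = np.array([0, 1, 0, 1, 2, 2])
subset_sums = np.bincount(S, weights=D)    # sum of D per subset index
subset_counts = np.bincount(S)             # number of elements per subset
subset_means = subset_sums / subset_counts

if __name__ == "__main__":
    print(matrix)
    print(common)
    print(subset_means)
```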

These are well-designed practice problems, and I find the ★★★ level quite difficult.
This time I had Codex solve the ★☆☆ and ★★☆ levels.
A problem was counted as correct if Codex produced a correct answer within 5 trials.
We omit the video here and report only the results.
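
The "correct within 5 trials" rule can be sketched as follows. How each answer was actually judged is not described in this article, so `is_correct` is a hypothetical checker supplied by the caller, and `solved_within` is a name chosen for illustration.

```python
def solved_within(attempts, is_correct, max_trials=5):
    """Return the 1-based trial number of the first correct attempt.

    Only the first `max_trials` attempts are considered. Returns None if
    none of them is correct.
    """
    for trial, attempt in enumerate(attempts[:max_trials], start=1):
        if is_correct(attempt):
            return trial
    return None


if __name__ == "__main__":
    # Toy example: each "attempt" is a candidate function for the sum 1..n.
    attempts = [lambda n: n, lambda n: n * (n + 1) // 2]
    print(solved_within(attempts, lambda f: f(10) == 55))
```

A problem counts as correct when the function returns a trial number and as incorrect when it returns None.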

Level ★☆☆

(Figure: results for level ★☆☆)

Level ★★☆

(Figure: results for level ★★☆)

Aggregate results

(Figure: aggregate numpy-100 results)

Some exercises do not ask for Python code to be generated, so they are excluded.
For the exact-match check, code that differs from the published answer only in a variable name is judged as different. Problems that have only one possible correct answer however you approach them, such as the package import problem, are not counted as exact matches.
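
A minimal sketch of such an exact-match check is shown below. Ignoring trailing whitespace and blank lines is an assumption made here; the only rule stated above is that a difference in variable names counts as a difference. The function names are hypothetical.

```python
def normalize(code):
    """Strip trailing whitespace and drop blank lines (an assumed leniency)."""
    lines = [line.rstrip() for line in code.strip().splitlines()]
    return "\n".join(line for line in lines if line)


def is_exact_match(generated, reference):
    """True only if the two snippets are textually identical after normalizing."""
    return normalize(generated) == normalize(reference)


if __name__ == "__main__":
    reference = "Z = np.zeros(10)\nprint(Z)"
    print(is_exact_match("Z = np.zeros(10)\nprint(Z)\n", reference))
    # A different variable name is judged as a difference.
    print(is_exact_match("a = np.zeros(10)\nprint(a)", reference))
```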

As a result, level ★☆☆ had 1 incorrect answer for an accuracy rate of 96.6%, and there were no exact code matches.
Level ★★☆ had 9 incorrect answers for an accuracy rate of 69%, and there were three exact code matches.

These results suggest that the model has learned well to produce Python code that follows natural-language instructions. On the other hand, the exact matches made me suspect that the model may be overfitting to the problem statements of the difficult, rare problems. However, a certain number of rare problems were not answered correctly, so I do not think it can be stated that simply. This point needs a more in-depth analysis.
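
The text does not state how many problems were actually scored at each level after the exclusions. As a quick consistency check, the sketch below searches for totals that reproduce the reported rates from the reported numbers of incorrect answers. Any total it finds is an inference from the reported figures, not a number given in this article, and the function names are chosen for illustration.

```python
def accuracy_percent(total, incorrect):
    """Accuracy in percent for `total` scored problems with `incorrect` misses."""
    return 100.0 * (total - incorrect) / total


def consistent_totals(incorrect, reported, decimals, search_up_to=100):
    """Totals whose rounded accuracy equals the reported accuracy."""
    return [
        total
        for total in range(incorrect + 1, search_up_to + 1)
        if round(accuracy_percent(total, incorrect), decimals) == reported
    ]


if __name__ == "__main__":
    # Level with 1 incorrect answer and a reported rate of 96.6%
    print(consistent_totals(incorrect=1, reported=96.6, decimals=1))
    # Level with 9 incorrect answers and a reported rate of 69%
    print(consistent_totals(incorrect=9, reported=69, decimals=0))
```

Under these assumptions, both searches return a single total of 29 scored problems, so the two reported rates are each consistent with 29 problems being scored at that level.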

In conclusion

This is my subjective impression, but I found this a very interesting code generation experiment.
There is still room for improvement, but I felt a great deal of potential for future progress.
In a few more years, the era of programming by combining natural-language sentences may arrive.

Even in its current state, I think it can serve as an auxiliary tool for programming.
People still do the coding, but when I can't come up with a good idea, I would want to ask Codex.
I also found the results produced from abstract instructions very interesting.

How did you like this three-part Practical GPT-3 series?
I hope it serves as a useful source of information for you.

Part 1: Practical GPT-3 Series (1) Creation of Ad Generator
Part 2: Practical GPT-3 Series (2) Does fine-tuning improve accuracy?

Thank you for reading to the end.

Hiroshi Sasaki
