[Trying out AI persona marketing with NVIDIA DGX Spark]
Episode 1: Overview of the Demo
Episode 2: Selection of the Subjects to be Investigated
Episode 3: Question Generation
Episode 4: Generating and Collecting Responses
Episode 5: Visualizing the Survey Results
Episode 6 Summary
Summary
Thank you for reading this far. Here's a summary of what we've covered so far.
- We were able to achieve automated survey responses using AI personas by running the process entirely locally on a relatively inexpensive platform called DGX Spark.
- Processing time was also significantly reduced by increasing the parallelism of requests to the LLM.
- We were able to demonstrate that survey question generation is also possible with LLM.
- On the other hand, I believe further verification is needed regarding the consistency of the responses and the validity of the persona's influence on the responses.
- In this demo, we used the Nemotron-Personas-Japan dataset as is to create persona data. However, considering its effectiveness in a corporate setting, I believe it would be better to use the Nemotron-Personas-Japan dataset as a seed to generate customized datasets tailored to the specific purpose, or to generate datasets using each company's own marketing data as a seed.
Creating a custom dataset
NVIDIA offers NVIDIA NeMo Data Designer as a tool for efficiently creating custom datasets. This tool is a framework for synthetic data generation that combines multiple generation backends, templates, validators, and structured schemas. This configuration offers the following advantages compared to simple text generation:
- Structured generation (Jinja templates + Pydantic validation)
- By conforming the output to a schema, it is possible to output high-quality data in JSON/CSV format that does not require post-processing.
- Supports multiple backends
- By combining multiple models (e.g., GPT-OSS-120B or probabilistic graphical models), it is possible to achieve both statistical basis and natural language generation.
- Retry and automatic verification features
- Mechanisms are built in to ensure quality even when generating large quantities of data, including automatic retries, validation rules for generated content, and consistency checks of the syntax schema.
- Generation based on real-world distribution
- Generating logic based on statistical distributions allows for the reproduction of data distributions closer to reality than simple prompt generation.
Source: Synthetic data generation using NVIDIA NeMo Data Designer with Nemotron-Personas-Japan
Marketing data held by various companies may contain personal information, making NVIDIA NeMo Data Designer, which can be run locally (on-premise), ideal from a data protection standpoint.