Creating a Real-Time Document RAG with VAST InsightEngine - Part 3


We held a webinar explaining the content of this article. Please register using the form below to receive a URL to access the on-demand video.

If you missed it or if you were a participant and would like to watch it again, please register now!

Click here to watch on-demand video

In the previous article, we created a simple real-time RAG application using VAST InsightEngine.
In real-world use, however, access must be restricted per user or per group within a company or organization.
For example, some data, such as individual performance reviews, should not be visible to regular staff, while other data should be visible only to managers.
When building a RAG application, it is crucial to manage permissions for this data properly. (It would be a major problem if an employee could ask about and see a colleague's performance review, or if an intern could retrieve confidential company information.)

Application complexity due to permission settings

Implementing this level of privilege separation in traditional systems requires a complex design. The following patterns are possible for achieving privilege separation in a RAG system:

1. Create a vector database for each tenant and design access per user or organization.

2. Store tenant information in a single vector database and filter on it during vector searches.

Create a vector database for each tenant and design access per user or organization

With this permission structure, a vector database schema is created for each tenant, and access is controlled tenant by tenant.

Because data is managed per schema, this method is relatively easy to implement on the application side. And because the databases are separate, day-to-day operation is straightforward.

The drawback is that schema management becomes more complex as the number of tenants grows. The configuration becomes more complex still if you need to assign access permissions to individual users within a tenant, and cross-tenant searches are harder to implement.
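The trade-off above can be illustrated with a minimal sketch in plain Python. It is not tied to any particular vector database: the `stores` dictionary stands in for one schema per tenant, and the tenant names, vectors, and helper functions are all hypothetical. A search within one tenant is trivial, but a cross-tenant search has to visit every store and merge the results itself.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical data: one independent store per tenant
# (a stand-in for one vector database schema per tenant).
stores = {
    "sales": [([1.0, 0.0], "sales forecast"), ([0.9, 0.1], "sales contacts")],
    "hr": [([0.0, 1.0], "performance reviews")],
}

def search_tenant(tenant, query, k=1):
    """Search only the store that belongs to one tenant."""
    ranked = sorted(stores[tenant], key=lambda row: cosine(row[0], query), reverse=True)
    return [text for _, text in ranked[:k]]

def search_across(tenants, query, k=1):
    """Cross-tenant search: every store must be visited and the results merged."""
    merged = [row for tenant in tenants for row in stores[tenant]]
    ranked = sorted(merged, key=lambda row: cosine(row[0], query), reverse=True)
    return [text for _, text in ranked[:k]]
```

Each new tenant adds another entry to `stores`, which is the schema-management overhead described above.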

Store tenant information in a single vector database and filter on it during vector searches
With this configuration, access restrictions are implemented by filtering on tenant information during vector searches.
The vector database consists of a single schema, which makes it very easy to manage. However, permissions cannot be changed dynamically, so index management becomes complex when the number of tenants grows large. For example, if teams merge or split at the start of a new fiscal year, the tenant information stored in the database must be rewritten each time.
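As a rough illustration of this second pattern, the sketch below keeps every row in one list with a `tenant` label and filters on that label before ranking. It is plain Python with hypothetical rows, tenant names, and helpers, not the API of any specific vector database. The `merge_tenants` helper shows why a reorganization is costly: the stored labels themselves have to be rewritten.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical single table: every row carries its tenant label.
rows = [
    {"vec": [1.0, 0.0], "sentence": "public overview", "tenant": "all"},
    {"vec": [0.9, 0.1], "sentence": "team A roadmap", "tenant": "team_a"},
    {"vec": [0.0, 1.0], "sentence": "team B budget", "tenant": "team_b"},
]

def filtered_search(query, allowed_tenants, k=2):
    """Drop rows the caller may not see, then rank the rest by similarity."""
    candidates = [r for r in rows if r["tenant"] in allowed_tenants]
    ranked = sorted(candidates, key=lambda r: cosine(r["vec"], query), reverse=True)
    return [r["sentence"] for r in ranked[:k]]

def merge_tenants(old, new):
    """Relabel rows when two teams merge; returns how many rows were rewritten."""
    changed = 0
    for r in rows:
        if r["tenant"] == old:
            r["tenant"] = new
            changed += 1
    return changed
```

In this toy version the rewrite touches one row; in a real index it would touch every row belonging to the merged team.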

Permissions for Vector Databases using VAST Data

VAST InsightEngine creates and uses a vector database inside VAST Data, so permissions for the vector database can be configured within VAST Data itself. In addition, VAST Data can apply column-based permission settings to table data based on the permissions of a VAST S3 bucket.
In other words, vector data created from raw data can carry the same access permissions as the raw data itself. Because the permissions on the original data can be changed dynamically from the VAST Data side, the permissions on the vectors can be changed dynamically as well.

This feature offers several advantages: data can be managed within a single schema, and vector database permissions can be changed dynamically. Together these significantly reduce the effort that building a RAG traditionally requires and lower the operational burden.

This time, we will use this method to build a vector database on VAST Data. The construction method is simple: just add a column called vastdb_s3_path_auth to the table. A column with this name is already registered as a reserved column in VAST Data. If you enter the path (S3 bucket/key) of the data stored in the S3 bucket in this column, access restrictions are applied according to that path.
We will use the following two text files.

general_info.txt

Macnica, Inc., headquartered in Kohoku-ku, Yokohama, Kanagawa Prefecture, is a company that operates two businesses: VAD (Value Added Device) business, which deals with semiconductors and cybersecurity, and cyber-physical system solutions business. It is generally classified as a specialized trading company.

company_info.txt

Macnica Clavis Company primarily deals with NVIDIA, VAST Data, Texas Instruments, and Renesas.
Macnica Altima Company primarily deals with Intel and Altera products.
Macnica networks Company handles security products such as CrowdStrike, Box, and Exabeam.

Create a permission-aware vector database using VAST Data

general_info.txt, which contains general information about Macnica as a whole, is stored in the macnica bucket, while company_info.txt is stored in the macnica-tec bucket, which can only be viewed by internal members.
Next, we create a database to store this data.
The vector database has three columns: a text vector (vec column), the text itself (sentence column), and vastdb_s3_path_auth.

Below is sample code for inserting data into this vector database. It inserts dummy vectors and sentences.
In a real RAG system, you would prepare text from your own content and vectorize it.

import pyarrow as pa
import vastdb

DB_ENDPOINT = "http://10.0.50.70"
DB_ACCESS_KEY = "bucket access key"
DB_SECRET_KEY = "bucket secret key"
DB_BUCKET_NAME = "DB bucket name"
SCHEMA_NAME = "schema name"
TABLE_NAME = "table name"

# Connect to the database
session = vastdb.connect(
    endpoint=DB_ENDPOINT,
    access=DB_ACCESS_KEY,
    secret=DB_SECRET_KEY,
)

# Define the format of the data to insert:
# a 2048-dimensional vector (vec), its text (sentence),
# and the S3 path used for access control (vastdb_s3_path_auth)
dimension = 2048
columns = pa.schema([
    ("vec", pa.list_(pa.field(name="item", type=pa.float32(), nullable=False), dimension)),
    ("sentence", pa.string()),
    ("vastdb_s3_path_auth", pa.string()),
])

# Create the data to insert
# Two dummy vectors
vector = [[0.1] * dimension, [0.2] * dimension]
# Two dummy sentences
sentence = ["Hello VAST!", "Hello, InsightEngine!"]
# S3 bucket and key for each row. The double quotes are part of the value.
s3_path = ['"macnica/general_info.txt"', '"macnica-tec/company_info.txt"']

# Build the table to insert into the DB
data = [vector, sentence, s3_path]
datas = pa.table(data, schema=columns)

# Connect and insert the vectors
with session.transaction() as tx:
    bucket = tx.bucket(DB_BUCKET_NAME)
    # Create a new schema
    schema = bucket.create_schema(SCHEMA_NAME)
    # Create a new table
    table = schema.create_table(TABLE_NAME, columns)
    table.insert(datas)
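The sample code notes that each vastdb_s3_path_auth value must include the double quotation marks around the bucket/key path. Writing those nested quotes by hand is easy to get wrong, so a small helper can build the value. The function name `s3_auth_path` is hypothetical and is not part of the vastdb SDK. It only formats a string in the shape the sample code uses.

```python
def s3_auth_path(bucket: str, key: str) -> str:
    """Build a vastdb_s3_path_auth value of the form '"bucket/key"'.

    Assumption: the column expects the path wrapped in literal double
    quotes, as in the sample code's s3_path list.
    """
    if not bucket or not key:
        raise ValueError("bucket and key must both be non-empty")
    clean_key = key.lstrip("/")  # avoid producing "bucket//key"
    return f'"{bucket}/{clean_key}"'

# Rebuild the two values used in the sample code.
s3_path = [
    s3_auth_path("macnica", "general_info.txt"),
    s3_auth_path("macnica-tec", "company_info.txt"),
]
```

The resulting `s3_path` list can be passed into the table data in place of the hand-written literals.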

Try out the RAG

Let's try out the RAG built with these permission rules.
Create two policies within VAST Data: query_policy, which allows access only to the macnica bucket, and query_policy_all, which allows access to both the macnica and macnica-tec buckets. Then apply query_policy to the Normal group and query_policy_all to the macnica group.

query_policy

query_policy_all

Group Information

With this setup in place, we switch groups and submit questions to the RAG. When query_policy is applied, only the information in general_info.txt is reflected in the answers. When query_policy_all is applied, the information in company_info.txt is reflected as well.
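The expected behavior can be pictured with a small plain-Python model. This is only a mental model: VAST Data enforces the policies itself, and the actual policy format is not shown in this article. The dictionaries and the `visible_sentences` function below are hypothetical. The policy, group, and bucket names are the ones used above, and the row sentences are shortened stand-ins for the two files.

```python
# Hypothetical model: which buckets each policy allows.
POLICIES = {
    "query_policy": {"macnica"},
    "query_policy_all": {"macnica", "macnica-tec"},
}

# Which policy is applied to each group (as described above).
GROUP_POLICY = {
    "Normal": "query_policy",
    "macnica": "query_policy_all",
}

# Rows as stored in the vector table (vectors omitted for brevity).
ROWS = [
    {"sentence": "general company overview",
     "vastdb_s3_path_auth": '"macnica/general_info.txt"'},
    {"sentence": "internal company details",
     "vastdb_s3_path_auth": '"macnica-tec/company_info.txt"'},
]

def bucket_of(auth_path: str) -> str:
    """Extract the bucket name from a quoted 'bucket/key' value."""
    return auth_path.strip('"').split("/", 1)[0]

def visible_sentences(group: str) -> list:
    """Return the sentences a group could retrieve under its policy."""
    allowed = POLICIES[GROUP_POLICY[group]]
    return [r["sentence"] for r in ROWS
            if bucket_of(r["vastdb_s3_path_auth"]) in allowed]
```

In this model, moving a group to a different policy changes what it can retrieve without touching any stored row, which is the dynamic behavior described earlier.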

Benefits of building a RAG system with VAST InsightEngine

In this three-part series, we explained how to build a RAG system using VAST InsightEngine. VAST InsightEngine provides the components that a traditional RAG application needs. In particular, it handles permission management, which is often the biggest bottleneck in application development. By using this functionality, you can accelerate the development of RAG applications and put previously unused data to work.