Securing Sensitive Data for AI Agents
A guide on how to protect your sensitive data when using AI agents
January 9th, 2025
When we first started building Neosync, we decided to use SWR hooks for data fetching because they were fairly straightforward and what Next recommended. This was fine while our app was small, but as the app grew and the number of hooks increased, it became burdensome to write and manage all of our hooks by hand. So I spent the last few days replacing some of our manually written frontend infrastructure with fully auto-generated, typesafe TanStack React Query hooks.
I was able to replace the majority of our backend-for-frontend (BFF) routes as well, which essentially acted as a thin passthrough layer to our standalone backend system.
Neosync is made up of a few different technologies and in this post, I'll go through the following:
Overall, our setup is much simpler and far less error prone. As a small team, it's imperative to have few moving parts and strive to have the system do the heavy lifting for you when it can.
Let's take a moment to run through the technologies present:
swr for client-side fetching; it powers all of our hooks.
swr communicates solely with this layer and abstracts away the backend entirely.
buf to generate the server and client bindings.
Because we use Nextjs, we take advantage of the API provided by the framework to handle things like authentication, along with server-side abstraction. This allows us to, say, store an encrypted cookie with next-auth in the browser and decrypt it on the server.
swr Example
Let's look at an example of what the existing setup looked like.
Jobs are async workflows that stream data from a source to a destination and anonymize it while it streams.
export async function GET(req: NextRequest, { params }: any) {
  const { jobsApi } = getNeosyncApiClient();
  return jobsApi.getJobs(params.accountId);
}

import useSWR from 'swr';
import { GetJobResponse } from '@neosync/sdk';

export function useGetJobs(accountId: string): HookResponse<GetJobResponse> {
  const { data, isLoading, mutate } = useSWR(
    !!accountId ? `/accounts/${accountId}/jobs` : null,
    fetcher
  );
  return {
    data: isLoading ? null : GetJobResponse.fromJson(data),
    isLoading,
    mutate,
  };
}

export function Page() {
  const { account } = useAccount();
  const { data, isLoading } = useGetJobs(account.id);
}
This approach is fine, if a little manual. It works well overall and gives us full control over what the frontend sees, abstracting away how it communicates with our real backend server. The downside is that a frontend route must be created manually, and a translation layer is needed on the client side to get the data into the correct format.
Because we use protos, classes are generated from them, and these are not fully 1:1 with their JSON counterparts. This is most noticeable with enums and oneof types.
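To make that mismatch concrete, here is a minimal sketch of the kind of translation involved. All of the types below (JobStatus, JobSource, JobMessage, JobJson) are hypothetical and are not the real @neosync/sdk definitions. The sketch assumes the usual shapes: generated classes hold enums as numbers and a oneof as a tagged case/value pair, while the JSON form carries enum names as strings and oneof members as plain keys.

```typescript
// Hypothetical types for illustration only; not the real @neosync/sdk definitions.

// Generated code models enums as numbers...
enum JobStatus {
  UNSPECIFIED = 0,
  ENABLED = 1,
  PAUSED = 2,
}

// ...and a oneof as a tagged union with `case` and `value`.
type JobSource =
  | { case: 'postgres'; value: { connectionId: string } }
  | { case: 'mysql'; value: { connectionId: string } }
  | { case: undefined; value?: undefined };

interface JobMessage {
  id: string;
  status: JobStatus;
  source: JobSource;
}

// The JSON form carries enum names as strings and oneof members as plain keys.
interface JobJson {
  id: string;
  status?: string;
  postgres?: { connectionId: string };
  mysql?: { connectionId: string };
}

const statusByName: Record<string, JobStatus> = {
  JOB_STATUS_UNSPECIFIED: JobStatus.UNSPECIFIED,
  JOB_STATUS_ENABLED: JobStatus.ENABLED,
  JOB_STATUS_PAUSED: JobStatus.PAUSED,
};

// The translation step that a fromJson call performs for you.
function jobFromJson(json: JobJson): JobMessage {
  let source: JobSource = { case: undefined };
  if (json.postgres) {
    source = { case: 'postgres', value: json.postgres };
  } else if (json.mysql) {
    source = { case: 'mysql', value: json.mysql };
  }
  return {
    id: json.id,
    // Unknown or missing names fall back to UNSPECIFIED in this sketch.
    status: statusByName[json.status ?? ''] ?? JobStatus.UNSPECIFIED,
    source,
  };
}

const job = jobFromJson({
  id: 'job-1',
  status: 'JOB_STATUS_ENABLED',
  postgres: { connectionId: 'conn-1' },
});
```

Every hook that reads JSON from a BFF route needs a step like this before the data matches the generated types, which is why the fromJson call shows up inside the hook.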
You could alternatively use the @connectrpc/connect-web client and communicate directly with your backend, but that has drawbacks too. There is also no fully typed way of fetching those items.
The buf.build/connectrpc/query-es buf plugin allows us to auto-generate structures that can be plugged almost directly into TanStack Query.
I say almost because you will need to use @connectrpc/connect-query instead, which is a light wrapper around TanStack Query.
However, when doing so, you'll be able to have fully type-safe hooks that you can use very easily. You can follow the guide for setting up the necessary connect and tanstack providers in your layout so that everything just works nicely.
import { getJobs } from '@neosync/sdk/connectquery';
import { useQuery } from '@connectrpc/connect-query';
import { useAccount } from '@/components/providers/account-provider';

export function Page() {
  const { account } = useAccount();
  const { data, isLoading, refetch } = useQuery(
    getJobs,
    { accountId: account.id },
    { enabled: !!account.id }
  );
  return <div />;
}
That's it! There is no longer any need to create a separate hook (although you still can) or separate API Route to do what you need.
The connect-query library also makes it very easy to do mutations, which weren't as easy to do with swr (as swr is intended for querying only).
import { getJobs } from '@neosync/sdk/connectquery';
import { useMutation } from '@connectrpc/connect-query';
import { useAccount } from '@/components/providers/account-provider';

export function Page() {
  const { account } = useAccount();
  const { mutateAsync: getJobsAsync } = useMutation(getJobs);
  async function onClick(): Promise<void> {
    // some user-defined button handler
    const jobResponse = await getJobsAsync({ accountId: account.id });
  }
  return <div />;
}
Aha! You might be wondering how we actually connect to the Neosync backend. That is done in the configuration of the tanstack query client and can be done in two main ways:
I tested both of these setups and ultimately went with option #2. This requires minimal configuration, authentication continues to work, and I don't have to complicate my backend deployments with CORS support or anything like that.
swr mutate vs @tanstack/react-query refetch
One thing that was a bit awkward when switching from swr to @tanstack/react-query was being able to nicely mutate the cache and provide a new value to it.
With swr, each hook provides a mutate function that can be called as is: mutate(), which will refetch from the backend. However, you can also optimistically update the local cache by providing the response directly to that mutate method!
This is super handy for scenarios where the user has just created or updated a record and the response to that method gives you the latest object. We like to take this object and update the query cache directly so that the user doesn't have to wait for the refetch.
This is common practice at Neosync, where a user creates a new resource via some workflow like /new/jobs. The job is created and the user is routed to /jobs/:id.
Before we route the user to the /jobs/:id page, we set the cache for that route with the newly created Job resource. This is definitely an optimization, but it's little things like this that can really make a web app feel snappy and fresh.
swr example
import { mutate } from 'swr';
import { useRouter } from 'next/navigation';
import { GetJobResponse } from '@neosync/sdk';

export function CreateJobPage() {
  const { account } = useAccount();
  const router = useRouter();
  async function onSubmit(values: FormValues) {
    const newJobResp = await createNewJob(values); // hand written function that uses fetch
    mutate(
      `/api/jobs/${newJobResp.job.id}`, // we have to manually write the key, which is error prone
      new GetJobResponse({ job: newJobResp.job })
    );
    router.push(`/jobs/${newJobResp.job.id}`);
  }
  return <div />;
}

async function createNewJob(values: FormValues): Promise<CreateJobResponse> {
  // fetch to a locally created BFF route
  const res = await fetch('/api/jobs', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(values),
  });
  // wait for the response before parsing its JSON body
  return res.json();
}
Pretty nice, but requires manual steps and is error prone due to having to write untyped route paths, among other things.
import { createJob, getJob } from '@neosync/sdk/connectquery';
import { GetJobResponse } from '@neosync/sdk';
import { useMutation, createConnectQueryKey } from '@connectrpc/connect-query';
import { useAccount } from '@/components/providers/account-provider';
import { useQueryClient } from '@tanstack/react-query';
import { useRouter } from 'next/navigation';

export function CreateJobPage() {
  const { account } = useAccount();
  const { mutateAsync: createJobAsync } = useMutation(createJob);
  const queryclient = useQueryClient();
  const router = useRouter();
  async function onSubmit(values: FormValues) {
    // await the mutation so the created job is available before seeding the cache
    const newJobResp = await createJobAsync({ ...values });
    queryclient.setQueryData(
      createConnectQueryKey(getJob, { id: newJobResp.job.id }),
      new GetJobResponse({ job: newJobResp.job })
    );
    router.push(`/jobs/${newJobResp.job.id}`);
  }
  return <div />;
}
It's a little awkward with the queryclient, but it is overall very typesafe and requires no magic strings for querying or mutating the data.
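To illustrate why deriving the key from a typed method descriptor removes the magic strings, here is a small library-free sketch. The names MethodDescriptor, createKey, and TypedCache are hypothetical and only mimic the idea; they are not part of @connectrpc/connect-query or @tanstack/react-query, and the real key format differs.

```typescript
// Hypothetical miniature of the idea; not the connect-query or TanStack Query API.

// A method descriptor carries its input and output types as phantom type parameters.
interface MethodDescriptor<I, O> {
  readonly service: string;
  readonly method: string;
  readonly __types?: { input: I; output: O };
}

interface GetJobRequest {
  id: string;
}

interface GetJobResult {
  job: { id: string; name: string };
}

const getJobMethod: MethodDescriptor<GetJobRequest, GetJobResult> = {
  service: 'example.v1.JobService',
  method: 'GetJob',
};

// The key is derived from the descriptor and the typed input, so there is no string to mistype.
// Simplification: JSON.stringify is sensitive to property order for multi-field inputs.
function createKey<I, O>(method: MethodDescriptor<I, O>, input: I): string {
  return JSON.stringify([method.service, method.method, input]);
}

class TypedCache {
  private readonly entries = new Map<string, unknown>();

  // The value must match the method's output type, checked at compile time.
  set<I, O>(method: MethodDescriptor<I, O>, input: I, value: O): void {
    this.entries.set(createKey(method, input), value);
  }

  get<I, O>(method: MethodDescriptor<I, O>, input: I): O | undefined {
    return this.entries.get(createKey(method, input)) as O | undefined;
  }
}

const typedCache = new TypedCache();
typedCache.set(getJobMethod, { id: 'job-1' }, { job: { id: 'job-1', name: 'nightly sync' } });
const hit = typedCache.get(getJobMethod, { id: 'job-1' });
const miss = typedCache.get(getJobMethod, { id: 'job-2' });
```

Passing the wrong input shape or the wrong response type to set fails at compile time in this sketch, which is the property the hand-written swr keys could not give us.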
There are a lot of different ways to set up a frontend.
I detailed one way to do so here with swr, which is how we at Neosync did it for about a year and how I've seen it done at past companies.
After coming across the @connectrpc/connect-query project pretty recently, I finally had an excuse to try out tanstack query, and it is saving us a bunch of time now that we no longer have to write routes or their client-side fetch functions. Everything is now fully generated and typesafe, with no more translations after the fact!
This setup can be gradually adopted and is easy enough to introduce for new code sections, or if you're looking to rewrite some sections of the code base that are isolated.
I did a PoC initially on an isolated part of the codebase, then mostly went through section by section and updated every hook.
I hope this is useful for folks and if you want to chat more, you can reach out to me on our discord as well as check out the implementation directly in the Neosync codebase.
Nucleus Cloud Corp. 2025