Introduction

Ever wished you could just talk to your map data? 🗺️ Instead of wrestling with complex SQL queries, what if you could just ask, “Show me all the parks in New York,” and get an answer instantly? That’s exactly what we’re going to build in this tutorial! We’ll create a “Natural Language GIS” app. You ask questions in plain English, and it magically translates them into SQL to query a geographic database. It feels like magic, but I promise, we’ll peek behind the curtain and see how it all works. ✨

The Tech Stack

Before we jump into the setup, let’s quickly go over the main technologies we’re using. This project combines a few powerful tools to make the magic happen. While you could build this with other tools like Python and Streamlit, here’s the stack we’re excited about for this guide:

Prerequisites

Before we dive in, you’ll need a few tools installed on your machine.
  • Bun: For managing dependencies and running scripts. You can find installation instructions at bun.sh.
  • Docker: We use Docker to run the Postgres database, as it’s the simplest way to get a PostGIS-enabled server running. You can get it from the official Docker website.
Prefer Dev Containers? If you’re a fan of VS Code Dev Containers, we’ve included a configuration for you! Just open the project in VS Code and click “Reopen in Container” when prompted. It handles all the installation for you.

Getting Started

Alright, let’s get this thing running!

1. Clone the Repository

First things first, we need the code. Open your terminal and clone the repository:
git clone https://github.com/supaworks/field-guide.git
cd field-guide/examples/natural-language-gis

2. Open in VS Code

Now, open the project folder in VS Code.
code .

3. Setup the Environment

Now, let’s configure the application.
1. Configure Environment Variables

We need a place to store our secrets, like API keys. We’ll create a .env file for that.
cp .env.example .env
This copies the example file to a new .env file. Open .env and paste in your SUPA_API_TOKEN.
If you don’t have a key yet, you can create one for free at supa.works.
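For reference, the finished file is just one key/value pair per line. The token value below is a placeholder; use your own:

```
# API key for the LLM backend (replace with your real token)
SUPA_API_TOKEN=your-token-here
```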
2. Install Dependencies

Next, let’s install the project’s libraries. We’re using Bun, a super-fast JavaScript toolkit.
bun install
3. Start the Database

Our app needs a database to talk to. We’ve defined it in a docker-compose.yml file. Let’s fire it up!
docker compose up -d
The -d flag runs it in “detached” mode, so it hums along quietly in the background.
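If you’re curious what such a Compose file contains, a minimal PostGIS setup typically looks something like this. (This is a sketch, not the repo’s actual file; the image tag, credentials, and port are assumptions.)

```yaml
services:
  db:
    image: postgis/postgis:16-3.4   # Postgres with the PostGIS extension preinstalled (assumed tag)
    environment:
      POSTGRES_USER: postgres       # placeholder credentials
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: gis
    ports:
      - "5432:5432"                 # expose Postgres to the host
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  db-data:
```

The `postgis/postgis` image saves you from installing the PostGIS extension by hand, which is exactly why Docker is the easy path here.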

4. Prepare the Database

An empty database is no fun. Let’s give our AI something to work with.
1. Create the Table Structure

Initially, our database is empty. This command reads our schema definition and creates the necessary tables.
bun run db:push
Under the hood, bun run db:push calls Drizzle (a TypeScript ORM) and tells it to create the predefined tables for us. You can check out exactly which tables are created in src/lib/server/db/schema.ts.
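To give you a feel for it, here’s a rough sketch of what a Drizzle schema with a PostGIS column can look like. The table and column names are illustrative, not the repo’s actual schema, and the geometry helper is my assumption of how recent drizzle-orm versions expose PostGIS types:

```ts
import { pgTable, serial, text, geometry } from 'drizzle-orm/pg-core';

// Illustrative table; the real definitions live in src/lib/server/db/schema.ts
export const busStops = pgTable('bus_stops', {
  id: serial('id').primaryKey(),
  name: text('name'),
  // PostGIS point geometry in WGS84 (SRID 4326)
  geom: geometry('geom', { type: 'point', mode: 'xy', srid: 4326 }),
});
```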
2. Seed the Database

“Seeding” is just a fancy word for planting some initial data. We have some sample map data (shapefiles) in the data/ folder. This script will load them into our PostGIS database.
bun run scripts/seed.ts
Now that we’ve got our tables, we need to fill them with content. This example uses open data provided by Stadt Leipzig. In scripts/seed.ts, you can see how the script takes the shapefiles and loads them into our database. This part depends entirely on your data.
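The general shape of that loading step can be sketched in plain TypeScript: for each shapefile, build an import command and run it against the database. This sketch assumes GDAL’s ogr2ogr as the loading tool and uses made-up file and connection details; the repo’s actual seed script may load data differently.

```typescript
// Build an ogr2ogr invocation that imports a shapefile into a PostGIS table.
// ogr2ogr ships with GDAL; the connection string here is a placeholder.
function buildImportCommand(shapefile: string, table: string): string[] {
  return [
    'ogr2ogr',
    '-f', 'PostgreSQL',                 // output driver
    'PG:host=localhost dbname=gis',     // connection string (placeholder)
    shapefile,
    '-nln', table,                      // target table name
    '-nlt', 'PROMOTE_TO_MULTI',         // normalize mixed geometry types
    '-lco', 'GEOMETRY_NAME=geom',       // name the geometry column
  ];
}

// One command per shapefile; you would pass this to a process runner like Bun.spawn.
const args = buildImportCommand('data/bus_stops.shp', 'bus_stops');
console.log(args.join(' '));
```

Separating “build the command” from “run it” also makes the seed logic easy to test without a live database.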
3. Generate AI Context

This is a crucial step. How does the AI know what tables and columns are in our database? It can’t just guess! This script scans the database and creates a “map” of its structure for the AI to use.
bun run scripts/generate_context.ts
Now that we’ve got a database with columns and real content, we need to describe it so an LLM can understand your data. Remember: even though LLMs are smart, they can’t figure out your data on their own. That’s why this script exists: it extracts and labels that information for the LLM, so it knows what your data is all about. Like seeding, this step depends on your data and how you plan to use it.

5. Run the App!

And now, the moment of truth. Let’s start the development server.
bun run dev
Open your browser to http://localhost:5173. You should see the Natural Language GIS interface, ready for your questions! 🎉

How It Works

So, how does typing “Show me all bus stops” get you a map of bus stops? It’s not magic; it’s a clever conversation with an AI.

What is the LLM doing with my prompt?

When you type a question, we don’t just send your raw text to the Large Language Model (LLM). We wrap it in a System Prompt, which is like giving the AI a cheat sheet for the exam. The System Prompt looks something like this:
“You are a world-class PostGIS expert. Your job is to translate a user’s question into a valid SQL query. Here is the schema of the database you are working with: The table ‘bus_stops’ has columns ‘name’ (text) and ‘geom’ (geometry)… Now, here is the user’s request: …”
The LLM uses this “context” to figure out that, for example, a question about “bus stations” refers to the bus_stops table.
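In code, assembling that system prompt is just string building. Here’s a minimal sketch; the wording and the example schema line are illustrative, not the app’s exact prompt:

```typescript
// Wrap a user question in a system prompt that carries the database schema.
function buildSystemPrompt(schemaDescription: string, userQuestion: string): string {
  return [
    'You are a world-class PostGIS expert.',
    "Your job is to translate a user's question into a valid SQL query.",
    'Here is the schema of the database you are working with:',
    schemaDescription,
    "Now, here is the user's request:",
    userQuestion,
  ].join('\n');
}

const schema = "The table 'bus_stops' has columns 'name' (text) and 'geom' (geometry).";
console.log(buildSystemPrompt(schema, 'Show me all bus stops'));
```

The key design point: the schema description is data, regenerated whenever your database changes, while the instructions stay fixed.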

How does the LLM know what to query?

That’s what our scripts/generate_context.ts script did! It’s like a scout that surveys the landscape of your database.
  1. It connected to your database.
  2. It listed all the tables (le_streets, bus_stops, etc.).
  3. It inspected the columns and their data types for each table.
  4. It saved this “map” of your database into a text file.
When you ask a question, we inject this map into the prompt. The LLM isn’t guessing; it’s using the information we provided to make an educated decision.
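A stripped-down version of that “scout” logic might look like this. Here the introspection results are hard-coded; the real script presumably reads them from the live database (for example via Postgres’s information_schema, though that’s an assumption about its implementation):

```typescript
interface Column { name: string; type: string }
type Schema = Record<string, Column[]>;

// Turn introspected tables and columns into a plain-text "map" for the prompt.
function describeSchema(schema: Schema): string {
  return Object.entries(schema)
    .map(([table, cols]) => {
      const colList = cols.map((c) => `'${c.name}' (${c.type})`).join(', ');
      return `The table '${table}' has columns ${colList}.`;
    })
    .join('\n');
}

// Hard-coded stand-in for a real database introspection pass.
const introspected: Schema = {
  bus_stops: [
    { name: 'name', type: 'text' },
    { name: 'geom', type: 'geometry' },
  ],
};

console.log(describeSchema(introspected));
// prints: The table 'bus_stops' has columns 'name' (text), 'geom' (geometry).
```

Because the output is plain text, it can be saved to a file once and injected into every prompt cheaply.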

How can I add my own data?

This is where it gets really fun! Want to map your own city or data?
  1. Find a Shapefile: A .shp file is a common format for map data. You can find them on government data portals or sites like OpenStreetMap.
  2. Add it to data/: Place the shapefile and its related files into the examples/natural-language-gis/data/ folder.
  3. Update the Seed Script: Open scripts/seed.ts and add a few lines to load your new file into a new table.
  4. Re-run Context Generation: Run bun run scripts/generate_context.ts again. This tells the AI about your new table, so you can start asking questions about it!

How can I improve the results?

Sometimes the AI might write a query that’s not quite right. Here are a few ways you can help it improve:
  • Better Column Names: The clearer your column names, the better. population_2023 is much easier for the AI to understand than pop23.
  • Add Comments: You can add comments to your tables and columns in the database. Our context generator is smart enough to read them and pass them to the AI!
  • Few-Shot Prompting: This is a fancy term for giving the AI examples. You can modify the system prompt to include a few examples of a good question and the perfect SQL query that answers it.
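Few-shot examples are easy to bolt onto the system prompt. Here’s a sketch; the example question/SQL pairs and the prompt wording are invented for illustration:

```typescript
interface Example { question: string; sql: string }

// Append question/SQL example pairs to a base system prompt.
function withFewShotExamples(basePrompt: string, examples: Example[]): string {
  const shots = examples
    .map((e) => `Question: ${e.question}\nSQL: ${e.sql}`)
    .join('\n\n');
  return `${basePrompt}\n\nHere are some examples of good answers:\n\n${shots}`;
}

const prompt = withFewShotExamples('You are a PostGIS expert.', [
  {
    question: 'Show me all bus stops',
    sql: 'SELECT name, geom FROM bus_stops;',
  },
]);
console.log(prompt);
```

Even two or three well-chosen pairs, especially ones using PostGIS functions like ST_Within, can noticeably steer the model toward the query style you want.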