Overview
This guide explains how to choose and configure AI Tables columns for company research, people discovery, contact enrichment, and scoring. It is meant to be actionable on its own: each supported column kind below includes the minimum working shape, required context, and the most common mistake.

Prerequisites
Endpoints used
- Create columns (`POST /v1/tables/:table_id/columns`)
- Update column (`PATCH /v1/tables/:table_id/columns/:column_id`)
- Delete column (`DELETE /v1/tables/:table_id/columns/:column_id`)
- Run table (`POST /v1/tables/:table_id/run`)
- Get table (`GET /v1/tables/:table_id`)
- Get table data (`GET /v1/tables/:table_id/data`)
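As a sketch of how the create-column endpoint might be called: the payload helper below only assembles the JSON body described in this guide; the base URL and the bearer-token auth header are assumptions, so check your Extruct credentials and API reference before sending.

```python
import json

BASE_URL = "https://api.extruct.ai"  # assumed base URL

def build_column_payload(kind: str, name: str, key: str, **extra) -> dict:
    """Assemble a create-column body: every column needs kind, name, and key;
    some kinds add more fields (value, roles, email_column_key, ...)."""
    payload = {"kind": kind, "name": name, "key": key}
    payload.update(extra)
    return payload

payload = build_column_payload("input", "Website", "company_website")
body = json.dumps(payload)

# To actually send it (left commented so the sketch runs offline):
# import urllib.request
# req = urllib.request.Request(
#     f"{BASE_URL}/v1/tables/{table_id}/columns",
#     data=body.encode(),
#     headers={"Content-Type": "application/json",
#              "Authorization": "Bearer <API_KEY>"},  # assumed auth scheme
#     method="POST",
# )
print(body)
```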
Column Guide in 60 Seconds
A column is one field Extruct fills for each row. Every column has `kind`, `name`, and `key`.
Some kinds also have `value`.
If `kind` is `agent`, you also choose `agent_type`, `output_format`, and a prompt.
Typical shape:
- `input`, `email_finder`, and `phone_finder` do not need a `value` block
- `reverse_email_lookup` uses top-level `email_column_key` instead of `value`
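A sketch of that shape for an `agent` column; the nesting of agent settings under `value` is an assumption based on this guide's notes, and the field values are illustrative:

```json
{
  "kind": "agent",
  "name": "Pricing model",
  "key": "pricing_model",
  "value": {
    "agent_type": "research_pro",
    "output_format": "text",
    "prompt": "Describe this company's pricing model in one sentence."
  }
}
```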
Column kinds at a glance
All column kinds require `kind`, `name`, and `key`.
| Kind | What it does | What you provide |
|---|---|---|
input | stores a value you already have | the row value |
agent | researches, reasons, or transforms one field | agent_type, prompt, output_format |
company_people_finder | finds people at each company by role | roles |
email_finder | finds a work email for a person | nothing extra |
phone_finder | finds a phone number for a person | nothing extra |
reverse_email_lookup | resolves profile data from a known email | email_column_key |
If kind = agent
`agent` is the core Extruct pattern. After choosing `kind: "agent"`, you choose the agent behavior, the output shape, and the prompt.
| Choice | What it controls |
|---|---|
agent_type | how Extruct gets or produces the answer |
output_format | the shape of the returned result |
prompt | the exact job the column should do |
Agent types at a glance
| Agent type | Best for | Not best for |
|---|---|---|
research_pro | researched answers on company tables | people or generic tables, direct contact lookups, LinkedIn fetch from a known URL |
research_reasoning | researched answers on people and generic tables | company-table use cases that need company-context disambiguation |
llm | transforming data already in the row | web research or source discovery |
linkedin | LinkedIn data from a known LinkedIn URL | discovering the LinkedIn URL in the first place |
Output formats at a glance
| Format | Use for | Extra required fields |
|---|---|---|
text | prose or notes | none |
url | one canonical URL | none |
email | one email-shaped answer from an agent | none |
select | exactly one allowed option | labels |
multiselect | zero or more allowed options | labels |
numeric | counts and measured values | none |
money | structured financial values | none |
date | exact or partial dates | none |
phone | one phone number in structured form | none |
grade | bounded scoring with explanation | none |
json | nested schema-defined output | output_schema |
Fast defaults
- On `company` tables, start with the official website domain or URL as `input`
- On `company` tables, default to `kind: "agent"` and `agent_type: "research_pro"`
- On `people` and `generic` tables, use `kind: "agent"` and `agent_type: "research_reasoning"`
- On `company` tables, choose `research_reasoning` only if you intentionally want to skip the company-context disambiguation step
- For scoring, use `output_format: "grade"`
- For people or contact workflows, prefer `company_people_finder`, `email_finder`, `phone_finder`, or `reverse_email_lookup` over generic prompts
- When you add new columns, rerun only the new column IDs with `mode: "new"`
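For the last default, the run request body might look like the sketch below; the field name `column_ids` is an assumption, so confirm it against the run endpoint's reference:

```json
{
  "mode": "new",
  "column_ids": ["col_123", "col_456"]
}
```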
Choose the right column kind
Use the simplest column kind that matches the result you want back.

input
What it does:
- stores a value you already have

Good for:
- company website domain or URL
- manual notes
- internal tags or metadata
- local helper fields such as `company_website` on a standalone `people` table

Required context:
- none; you provide the value directly in row data

Notes:
- on `company` tables, the preferred `input` is the company’s official website domain or URL
- a raw company name is a weaker fallback when you do not have the site

Returns:
- exactly the row value you send

Common mistake:
- using weak company names as the default company input when you already have a domain or URL
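A minimal `input` column sketch; the row value itself is supplied in row data, not in the column config:

```json
{
  "kind": "input",
  "name": "Website",
  "key": "company_website"
}
```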
agent
What it does:
- researches, reasons, or transforms one field

Good for:
- research a company fact
- classify a company or person
- summarize upstream evidence
- return structured output such as `select`, `money`, `grade`, or `json`

Required context:
- on `company` tables, Extruct auto-injects company context
- on `people` tables, Extruct auto-injects person context
- on `generic` tables, use explicit prompt references for the row fields you need

Returns:
- one value in the output format you choose

Common mistake:
- one prompt tries to do multiple jobs or references `{input}` unnecessarily on `company` or `people` tables
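A sketch of a single-job `agent` column on a `company` table; the placement of agent settings under `value` is an assumption, and the prompt deliberately avoids `{input}` because company context is auto-injected:

```json
{
  "kind": "agent",
  "name": "Target segment",
  "key": "target_segment",
  "value": {
    "agent_type": "research_pro",
    "output_format": "select",
    "labels": ["SMB", "mid-market", "enterprise"],
    "prompt": "Which customer segment does this company primarily target?"
  }
}
```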
company_people_finder
What it does:
- finds people at each company by role

Good for:
- branch from company research into a people workflow
- find leadership, decision-makers, or role-based contact targets

Required context:
- a `company` table with the correct company rows

Returns:
- fills the finder cell on the company table
- creates or updates a child `people` table for downstream enrichment

Roles:
- broader role families for coverage: `sales leadership`, `engineering leadership`
- exact titles for narrower targeting: `CEO`, `Head of Engineering`

Common mistake:
- adding it to a `people` table or using overly narrow role strings before checking coverage
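A sketch of a `company_people_finder` column; whether `roles` sits at the top level or inside a `value` block is an assumption to verify against the column reference:

```json
{
  "kind": "company_people_finder",
  "name": "Leadership",
  "key": "leadership",
  "roles": ["sales leadership", "engineering leadership"]
}
```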
email_finder
What it does:
- finds a work email for a person

Good for:
- enrich a `people` table after person rows already exist

Required context:
- `full_name`
- `profile_url`
- `company_website`

`company_website` can come from:
- a parent company row, or
- a local `people`-table input column keyed exactly `company_website`

Returns:
- one work email or no result

Common mistake:
- adding the column before `company_website` exists or trying to configure it with a `value` block
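A minimal `email_finder` sketch; note there is no `value` block, because the required context comes from existing row fields:

```json
{
  "kind": "email_finder",
  "name": "Work email",
  "key": "work_email"
}
```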
phone_finder
What it does:
- finds a phone number for a person

Good for:
- enrich a `people` table after person rows already exist

Required context:
- `full_name`
- `profile_url`
- `company_website`

`company_website` can come from:
- a parent company row, or
- a local `people`-table input column keyed exactly `company_website`

Returns:
- one phone number or no result

Common mistake:
- expecting high coverage without first verifying that the person rows and company website context are strong
reverse_email_lookup
What it does:
- resolves person or profile data from an email you already have

Good for:
- you already have an email column and want profile resolution rather than company-to-people branching

Required context:
- a source email column keyed in `email_column_key`

Returns:
- person or profile metadata such as name, headline, location, and profile URL

Common mistake:
- putting `email_column_key` inside `value` or using this when the real job is company-to-people discovery
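A sketch with `email_column_key` at the top level, as this guide requires; the source-column key `work_email` is illustrative:

```json
{
  "kind": "reverse_email_lookup",
  "name": "Profile from email",
  "key": "profile_from_email",
  "email_column_key": "work_email"
}
```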
Agent types
Agent types change what Extruct can do for a custom `agent` column.
research_pro and research_reasoning can both research, reason, and make judgments.
The difference is whether Extruct first disambiguates the requested company using company context before answering.
| Agent type | Best for | What it actually does |
|---|---|---|
research_pro | researched answers on company tables | web research plus Extruct DB search and similar-company retrieval, with a company-context disambiguation step |
research_reasoning | researched answers on people and generic tables | the same general research tool class, but without the company-context disambiguation step |
llm | row-local transformation | transforms existing row data only; no web research |
linkedin | LinkedIn data from a known LinkedIn URL | uses LinkedIn company or profile lookup behavior |
These jobs have dedicated mechanisms; use them instead of a research agent:
- fetch LinkedIn data from a known LinkedIn URL → `linkedin`
- find people at a company by role → `company_people_finder`
- find work emails → `email_finder`
- find phone numbers → `phone_finder`
- resolve a person from an email → `reverse_email_lookup`
research_pro
Use this for researched answers on company tables.
What makes it different:
- it uses the same general research and reasoning capability as `research_reasoning`
- before answering, it gathers or confirms company context
- it uses that company context to disambiguate whether the evidence belongs to the requested company or to some other company

Good for:
- pricing and packaging
- funding or valuation
- target segments
- recent news
- product or use-case research
- competitor or alternative discovery
- canonical company URLs such as LinkedIn or careers pages

Notes:
- this is the safe default for custom `agent` columns on `company` tables
- use it unless you intentionally want to skip the disambiguation step and you know the entity is already clear
research_reasoning
Use this for researched answers when you do not want the company-context disambiguation step.
What makes it different:
- it uses the same general research and reasoning capability as `research_pro`
- it does not start with the company-context disambiguation step
- it is the default research agent on `people` and `generic` tables
- on `company` tables, it is an expert choice when the entity is already clear and you intentionally want to skip the disambiguation step

Good for:
- official-site disambiguation
- custom 1-5 scoring
- SMB vs enterprise judgments
- ranked comparisons
- researched answers on `generic` tables
llm
Use this for row-local transformation only.
What to expect:
- no web research
- no page visits
- no source URLs
- cheaper downstream transformation after a research step already happened
Good for:
- turn notes into a `select`
- summarize several upstream columns
- normalize messy evidence
- derive structured `json` from row context
linkedin
Use this when the row already has a known LinkedIn company or person URL and you want LinkedIn-derived data.
Typical chain:
- use `research_pro` to find the right LinkedIn URL
- use `linkedin` to fetch LinkedIn data
- use `llm` to summarize or classify that data

people-table rule:
- custom `agent` columns on `people` tables can use `llm`, `research_reasoning`, and `linkedin`
- `research_pro` is not supported on custom `agent` columns on `people` tables
Built-in company and people context
company tables
company tables automatically create and fill:
- `company_profile`
- `company_name`
- `company_website`
people tables
Custom agent columns on people tables get baseline row context automatically:
- `full_name`
- `profile_url`
- `role`
Output formats
Choose the output format that best matches how the result will be used later.

| Output format | Best for | Example |
|---|---|---|
text | short prose or notes | company description |
url | one canonical URL | careers page |
email | one email address | normalized email |
select | one allowed option | pricing model |
multiselect | zero or more allowed options | markets served |
numeric | counts or measured values | employee count |
money | structured financial values | annual revenue |
date | exact or partial dates | founded date |
phone | one phone number | direct phone |
grade | bounded scoring with explanation | ICP fit |
json | nested structured output | competitors list |
- use `text` for short prose meant to be read directly
- use `select` or `multiselect` when the answer should stay within labels
- use `money` for revenue, ARR, valuation, or funding
- use `grade` for scoring
- use `json` only when the structure is genuinely useful downstream
For `agent` columns, the main result lives in `rows[].data.<key>.value.answer`.
What changes by `output_format` is the shape of `answer`. Supporting fields such as `explanation` and `sources` may sit alongside it in `rows[].data.<key>.value`.
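An illustrative slice of a get-table-data response for an agent column keyed `pricing_model`; the `explanation` and `sources` fields stand in for the supporting fields, and all values are made up:

```json
{
  "rows": [
    {
      "data": {
        "pricing_model": {
          "value": {
            "answer": "subscription",
            "explanation": "Pricing page lists monthly per-seat plans.",
            "sources": ["https://example.com/pricing"]
          }
        }
      }
    }
  ]
}
```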
text
Use this for prose or short notes that will be read directly.
Minimum config:
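A minimal sketch, with field placement under `value` assumed:

```json
{
  "kind": "agent",
  "name": "Company summary",
  "key": "company_summary",
  "value": {
    "agent_type": "research_pro",
    "output_format": "text",
    "prompt": "Summarize what this company does in two sentences."
  }
}
```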
- use `text` only when the result is meant to be read directly; if you need filtering or sorting later, prefer a structured format
url
Use this for one canonical URL.
Minimum config:
- ask for the single best URL only; do not ask for a list unless you actually need a `json` structure
email
Use this when you want one email-shaped answer from a custom agent.
Minimum config:
- if the real job is finding a work email for a person, prefer `email_finder`
select
Use this for exactly one allowed option.
Minimum config:
`labels` is required in the column config.
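A sketch with `labels` included; the nesting under `value` is an assumption:

```json
{
  "kind": "agent",
  "name": "Pricing model",
  "key": "pricing_model",
  "value": {
    "agent_type": "research_pro",
    "output_format": "select",
    "labels": ["freemium", "subscription", "usage-based", "one-time"],
    "prompt": "Which pricing model best describes this company?"
  }
}
```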
Typical returned shape:
- `answer` is exactly one of the configured labels
- labels should be stable categories, not free-form text or sentence-length answers
multiselect
Use this for zero or more allowed options.
Minimum config:
`labels` is required in the column config.
Typical returned shape:
- `answer` is zero or more of the configured labels
- use `multiselect` only when multiple labels can legitimately be true at the same time
numeric
Use this for counts, ranges, or measured values.
Minimum config:
- prefer `numeric` over `text` when the result may be sorted, filtered, or bucketed later
money
Use this for structured financial values such as revenue, ARR, valuation, or funding.
Minimum config:
- prefer `money` over prose for any financial figure you may compare or score later
date
Use this for exact dates, partial dates, or date ranges.
Minimum config:
- partial dates are valid and expected; do not force full day-level precision when the source does not support it
phone
Use this for one phone number in structured form.
Minimum config:
- if the real job is finding a phone number for a person, prefer `phone_finder`
grade
Use this when you want a bounded score plus an explanation.
The recommended scoring pattern is a custom `agent` column with `output_format: "grade"`.
Example:
- 5: strong yes
- 4: likely yes
- 3: mixed, partial, or uncertain
- 2: likely no
- 1: strong no
- Not found: not enough evidence to score confidently

Not found shape:
- when there is not enough evidence, the cell resolves to a not-found result instead of a forced low score
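A sketch of the recommended scoring column, embedding the scale in the prompt; nesting under `value` is an assumption:

```json
{
  "kind": "agent",
  "name": "ICP fit",
  "key": "icp_fit",
  "value": {
    "agent_type": "research_pro",
    "output_format": "grade",
    "prompt": "Score ICP fit from 1-5: 5 strong yes, 4 likely yes, 3 mixed or uncertain, 2 likely no, 1 strong no. Return Not found if there is not enough evidence."
  }
}
```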
json
Use this for user-defined nested output.
Minimum config:
`output_schema` is required.
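A sketch with an `output_schema`; the schema dialect is an assumption (shown JSON-Schema-like), so match it to the column reference:

```json
{
  "kind": "agent",
  "name": "Competitors",
  "key": "competitors",
  "value": {
    "agent_type": "research_pro",
    "output_format": "json",
    "prompt": "List this company's top competitors with a one-line reason each.",
    "output_schema": {
      "type": "object",
      "properties": {
        "competitors": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "name": {"type": "string"},
              "reason": {"type": "string"}
            }
          }
        }
      }
    }
  }
}
```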
Typical returned shape:
- unlike the fixed formats above, the shape of `answer` is defined by your `output_schema`
Dependencies and prompt writing
For custom `agent` columns, dependencies come from:
- prompt references such as `{pricing_notes}`
- `extra_dependencies`
Use `extra_dependencies` only when you intentionally need an upstream dependency that is not referenced in the prompt.
Dependency example:
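A sketch of a downstream `llm` column whose dependency comes from the `{pricing_notes}` prompt reference; `extra_dependencies` is included only to illustrate its placement, and its top-level position is an assumption:

```json
{
  "kind": "agent",
  "name": "Pricing summary",
  "key": "pricing_summary",
  "extra_dependencies": ["company_summary"],
  "value": {
    "agent_type": "llm",
    "output_format": "text",
    "prompt": "Summarize the evidence in {pricing_notes} in one sentence."
  }
}
```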
Prompt writing:
- each column should do one job
- say exactly what artifact should come back
- prefer natural-language prompts on `company` and `people` tables
- use explicit prompt references only when a column truly depends on another custom field
Troubleshooting
Unresolved column references
Cause:
- the prompt references a missing column key

Fix:
- create the source column first
- or change the prompt to reference an existing key
Column is not compatible with the table kind
Common examples:
- `company_people_finder` on a `people` table
- `research_pro` custom `agent` columns on a `people` table
Cells never start running
Cause:
- upstream dependency cells are not done

Fix:
- inspect the source columns first
- keep dependency chains short
- rerun only the new column IDs with `mode: "new"`
Results are hard to reuse downstream
Cause:
- too many `text` outputs

Fix:
- move repeated decisions into `select`, `multiselect`, `numeric`, `money`, `date`, `phone`, `grade`, or `json`
A column feels too expensive
Cause:
- web research is being used for a transformation problem

Fix:
- research the source fact once
- derive downstream fields with `llm`
Recommended defaults
If you are unsure:
- default to `company` tables
- default to official website domain or URL as company-table `input`
- default to built-in or purpose-built column kinds when available
- default to `research_pro` on `company` tables
- default to `research_reasoning` on `people` and `generic` tables
- on `company` tables, switch to `research_reasoning` only when you intentionally want to skip company-context disambiguation
- default to `llm` for downstream transformations
- default to `agent` + `grade` for scoring
- default to plain natural-language prompts on `company` and `people` tables
- add explicit prompt references only for intentional chaining or on `generic` tables