Founding Search Engineer

Brisbane, Australia
Full-time, Permanent
In-office
$180k-$230k + super + bonus

About the job

We help businesses respond to the complex paperwork and questionnaires companies send out when choosing vendors - the kind that used to take days or even weeks to complete.

In just 3 years, we've built technology that turns a 900-question form from a multi-day nightmare into a 2-hour task. Our rapid growth (10-20% every month) landed us on the Courier Mail's Unicorn Watch List and earned us Queensland AI Startup of the Year 2024. Now we're scaling quickly to become the undisputed leader in our category, with a vision of building the largest AI-first company in Brisbane.

Our customers span satellite operators bidding on new contracts, Silicon Valley tech firms managing security questionnaires, and Wall Street funds running due diligence checklists. Sales teams that used to dread these projects now look forward to them. One customer even said their team feels "joy" starting a new one, and another said "AutoRFP.ai has been one of the most life changing tools that I've used in my career."

Read their stories → autorfp.ai/customer-stories

What makes this role interesting

Behind every answer our platform generates is a search and retrieval system that has to find the right information from thousands of past responses, across dozens of customers, in milliseconds. Get it right and a 900-question form practically fills itself. Get it wrong and every answer downstream is wrong too. This is the highest-leverage technical problem in our product - and as our first dedicated search engineer, you'll own it.

Today, our retrieval layer combines vector search with neural reranking - it's what powers the 900-question-to-2-hours transformation our customers rely on. That foundation works, and we're ready to invest in the infrastructure that takes it to the next tier of accuracy and scale. We have a clear direction for the next phase: a multi-phase pipeline with hybrid retrieval (BM25 + semantic search), phased ranking that graduates from cheap scoring to expensive cross-encoder reranking, and tensor-based chunk selection. But this is a direction, not a spec. We need someone who can bring fresh perspective to the design, challenge and refine our approach, and own the architecture as it scales.
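To give a concrete flavour of the problem space, here's a toy sketch of the two ideas above - fusing BM25 and vector results, then graduating candidates from cheap to expensive scoring. All names are illustrative; in production a search engine handles this natively, and the real design decisions are exactly what you'd own.

```python
# Illustrative only - a minimal model of hybrid retrieval + phased ranking.
# Phase 0: fuse ranked lists from two retrievers (e.g. BM25 and vector search)
# with reciprocal rank fusion. Phase 1: cheap scoring over all candidates.
# Phase 2: expensive reranking (e.g. a cross-encoder) over the survivors only.

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of doc ids into one ordering.

    Each doc earns 1 / (k + rank + 1) from every list it appears in,
    so docs ranked highly by multiple retrievers float to the top.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)


def phased_rank(candidates, cheap_score, expensive_score, cutoff=100, top_n=10):
    """Score everything cheaply, then pay the expensive model only for
    the top `cutoff` survivors - the core economics of phased ranking."""
    survivors = sorted(candidates, key=cheap_score, reverse=True)[:cutoff]
    return sorted(survivors, key=expensive_score, reverse=True)[:top_n]
```

The cutoff between phases is a real tuning knob: too low and relevant chunks never reach the reranker, too high and latency blows the millisecond budget.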

You'll be doing meaningful systems work from the start - evaluating infrastructure options, standing up the indexing pipeline, building evaluation and diagnostic tooling, and evolving the production retrieval layer while keeping it running reliably for customers across 44 countries. The work is guided by a philosophy: diagnose before you optimise, ship incrementally, and stop the moment quality meets the bar. Each step produces data that informs whether the next step is worth doing.

Some weeks you'll be shipping a product feature end-to-end. Others you'll be deep in search quality - reviewing diagnostics, training a ranking model, evaluating it against the baseline. Some days you'll be debugging a production issue at the infrastructure level. Your changes ship to Fortune 500 companies in weeks, not quarters.

What you'll do

  • Own the search and retrieval architecture - evaluate the current design, identify the highest-leverage improvements, and make the key infrastructure and modelling decisions. The direction is set; how we get there is yours to own.

  • Build and operate the search infrastructure as it scales - from schema design and indexing pipelines through to cluster operations, embedding endpoints, and production reliability.

  • Design evaluation and diagnostic tooling - build the foundation that lets us measure and compound quality improvements with rigour.

  • Ship product features end-to-end - from defining the approach through to deployment and monitoring. You'll spend as much time building well-designed systems as you will tuning ranking quality.

  • Improve search quality through data, not intuition - instrument the pipeline, capture the right signals, and make iterative ranking improvements informed by production diagnostics. Know when to invest in a learned model and when simpler tuning is the better move.

  • Own the search roadmap - define the work, break it down, and ship incrementally rather than waiting for fully scoped tickets.

  • Raise the bar for the team - provide thoughtful code reviews, mentor engineers working in search-adjacent areas, and help shape how we hire as the team grows.

What good looks like

First week: Ship a PR to production. Meet the team. Start watching customer calls to build context on how people use the platform and where retrieval quality matters most.

First month: Own your first search improvement end-to-end, from identifying the opportunity through to production. You'll have context from the team on what matters and why - your job is to ship it well, measure it honestly, and ask early if something isn't clear. You'll be forming your own views on the target architecture and starting to make infrastructure decisions.

First 3 months: Consistently delivering search improvements that customers notice. You've stood up evaluation and diagnostic tooling that lets the team measure quality with confidence. You've built enough context to identify the highest-leverage problems in the pipeline and can take a problem that could take months and find what can be shipped incrementally. The evolved retrieval pipeline is taking shape - key architectural decisions are made and the first phases are running in production. You're making the people around you faster.

First 6 months: Driving the search and retrieval roadmap with real credibility earned through shipped work. You've delivered step-change improvements to retrieval quality - possibly including the first learned ranking models trained on production data - and can connect those improvements directly to customer and business metrics. You're shaping how we hire and helping set the standard for engineering quality across the team.

What we're looking for

This is a specialist role, so relevant domain experience matters more here than in our generalist positions. That said, this is also a Senior Engineer (IC3) role - you'll be expected to define your own work, ship incrementally, and multiply the effectiveness of the people around you, just like any senior engineer here. The domain expertise is layered on top of that baseline, not a substitute for it.

Strong engineer first, search specialist second - You'll spend as much time building well-designed systems and shipping features as you will tuning ranking quality. Our production stack is TypeScript/Node.js - you write clean code, you're comfortable with version control, testing, CI/CD, and deploying to production. You're not precious about whether a task is "backend", "data", or "ML".

Comfortable in Python for model training and data work - The ranking model training pipeline (linear regression, LightGBM, evaluation scripts, data analysis) lives in Python. You don't need to be a Python-first engineer, but you need to be genuinely productive in pandas, numpy, and ML tooling - not just "can read it."
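As a taste of that pipeline's simplest baseline (a hypothetical sketch, not our actual code): fit a linear model mapping retrieval features to a graded relevance label, then rank by predicted relevance - the pointwise starting point you'd measure a LightGBM ranker against.

```python
import numpy as np

# Hypothetical pointwise ranking baseline. Features per candidate might be
# (BM25 score, vector similarity, ...); labels are graded relevance.

def fit_linear_ranker(features, labels):
    """Least-squares fit of feature weights plus a bias term."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # append bias column
    weights, *_ = np.linalg.lstsq(X, labels, rcond=None)
    return weights


def rank_candidates(features, weights):
    """Return candidate indices sorted by predicted relevance, best first."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.argsort(-(X @ weights))
```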

Systems-level search engine experience - You've built, operated, or meaningfully contributed to a search engine deployment (Vespa, Elasticsearch, Solr, or similar) at the infrastructure level - not just called APIs. You understand how matching, scoring, and phased ranking work inside the engine, and you're comfortable with the operational side: cluster sizing, resource budgeting, schema design, and deployment.

Diagnostic instinct - You look at logs and metrics to form hypotheses before jumping to solutions. When something breaks in the retrieval pipeline, you narrow down the problem methodically - and you ask good questions before diving into the parts you don't yet know.

Quantitative rigour - You understand evaluation metrics, overfitting, and experimental design well enough to measure search quality honestly and not fool yourself with noisy results. If you've trained ranking models before, great - but what matters more is knowing how to tell whether a change actually helped.
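For instance, NDCG@k - one standard way to score a ranking against graded relevance labels - looks like this (a minimal illustrative sketch, not prescribing our metric of choice):

```python
import math

# NDCG@k: discounted cumulative gain over the top-k results, normalised
# by the gain of the ideal ordering. 1.0 means a perfect ranking.

def dcg_at_k(relevances, k):
    """Sum graded relevance, discounting lower ranks logarithmically."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG divided by the ideal (descending-sorted) DCG."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

The honest-measurement part is everything around the metric: held-out queries, enough samples that a delta isn't noise, and labels that weren't produced by the model you're evaluating.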

We'd value depth in several of the following, but none are hard requirements - we care more about your ability to learn fast, ship well, and make others around you more effective:

  • Information retrieval concepts: BM25, vector similarity, hybrid retrieval, learning-to-rank, recall vs precision tradeoffs

  • Phased ranking architectures - understanding how to organise cheap-to-expensive scoring stages and why it matters at scale

  • Embedding models (sentence transformers, contrastive fine-tuning, binarisation/quantisation)

  • Cloud infrastructure (AWS), model hosting (SageMaker), and distributed systems

  • Search relevance evaluation tooling (Quepid or similar)

  • Monitoring and observability (Prometheus, Grafana)

  • Building LLM-powered applications, particularly RAG systems

  • Leading technical projects or mentoring engineers

Backgrounds that would be a strong fit

We're open-minded about where you're coming from. The strongest fit - and where you'd ramp fastest - is search infrastructure experience combined with learning-to-rank work. But these paths also map well:

Search infrastructure engineers who've built or operated Vespa, Solr, or Elasticsearch clusters and have exposure to ranking model work - you understand the systems context and can hit the ground running on the infrastructure build.

ML engineers from e-commerce, adtech, or content platforms - if you've worked on ranking, recommendations, or click-through prediction in production, the problem structure is very similar. You'll need to ramp up on the infrastructure and operational side, but the modelling intuition transfers directly.

Applied ML or NLP engineers who've worked with embeddings, fine-tuned language models, or built RAG pipelines - you have strong intuition for retrieval and representation.

Product-minded software engineers who've shipped user-facing features and want to go deeper on search and ML - you already know how to build reliable software and can pick up the domain. Best fit if you have some exposure to search systems or information retrieval concepts.

Academic background in information retrieval - if you've studied learning-to-rank or neural IR and want to apply it in production.

What you won't get

  • Layers of approval before shipping

  • A narrowly scoped role where you only touch one part of the stack

  • Pre-chewed tickets with every detail figured out for you

  • Work that's disconnected from real customer outcomes

Details

Location: Brisbane - this is a full-time, in-office role
Type: Full-time, permanent
Salary: $180k-$230k base depending on experience + super + performance bonus

Further benefits

  • Equipment - Latest MacBooks, massive 49" 5K screens, and whatever else makes you effective

  • Compute & infrastructure access - The GPU and cloud resources you need to train models and run experiments without waiting for approval

  • Learning budget - Courses, conferences, books - if it makes you better, we'll cover it

  • Direct customer access - See how your search improvements affect real workflows

  • Global reach - Opportunities to visit customers worldwide as we expand

  • Great office, great location - Brisbane CBD, easy transport, a space people actually enjoy

  • High-impact work - Small team, real ownership, and features that ship to Fortune 500 companies

Apply for this role

Interested? Fill out the form below.