Software Developer in Test (Python) – Senior

Remote · $115k–$196k · Senior · English B2 · 12 days ago · Full-time · Quality 8.6/10

Role in brief

SOFTSWISS is looking for a Senior Software Developer in Test with strong Python skills to improve product quality, with a focus on AI and LLM systems. The role involves designing and automating test cases, building evaluation pipelines for AI, and contributing to CI/CD improvements. Candidates with a background in QA automation and an interest in AI testing should consider applying.

Python, PyTest, CI/CD, AI testing, test automation

About the role

This Senior Software Developer in Test role at SOFTSWISS centers on improving product quality through robust automation and testing, with a particular emphasis on artificial intelligence and large language model (LLM) systems. The position requires analyzing new feature requirements to define testing approaches, then automating these tests using Python and PyTest. A significant part of the work involves developing automated quality evaluation pipelines for AI, utilizing metrics and LLM-as-judge methods, and conducting adversarial testing to identify vulnerabilities.
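As a rough illustration of what an automated quality check for a non-deterministic system can look like, the sketch below samples a model several times and asserts on the pass rate rather than on one exact output. It is not taken from the SOFTSWISS framework: `pass_rate`, `fake_model`, and `keyword_judge` are hypothetical names, and the stubs stand in for a real model call and a real LLM-as-judge or metric.

```python
import random
from typing import Callable


def pass_rate(
    generate: Callable[[str], str],
    judge: Callable[[str, str], bool],
    prompt: str,
    runs: int = 20,
) -> float:
    """Return the fraction of sampled outputs that the judge accepts."""
    passed = sum(judge(prompt, generate(prompt)) for _ in range(runs))
    return passed / runs


# Hypothetical stand-ins: a real suite would call the system under test
# and an LLM judge (or a scoring metric) instead of these stubs.
_rng = random.Random(0)


def fake_model(prompt: str) -> str:
    """Simulate varied phrasings of the same correct answer."""
    return _rng.choice(["Paris", "The capital is Paris.", "Paris, France"])


def keyword_judge(prompt: str, answer: str) -> bool:
    """Accept any answer that mentions the expected entity."""
    return "paris" in answer.lower()


def test_capital_answer_is_stable() -> None:
    # PyTest-style test: assert on an aggregate threshold, not an exact string.
    rate = pass_rate(fake_model, keyword_judge, "What is the capital of France?")
    assert rate >= 0.9
```

The threshold of 0.9 and the run count of 20 are arbitrary choices for the sketch; in practice both would be tuned to the cost and variance of the system being evaluated.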

The successful candidate will be responsible for testing various aspects of AI systems, including MCP servers, tool schemas, and agentic workflows, ensuring proper tool selection, multi-step reasoning, and error handling. Beyond AI-specific tasks, the role includes maintaining and enhancing the existing test automation framework, contributing to internal testing tools, and preparing comprehensive test documentation. This involves participating in test design, estimation, release testing, and overall product quality assessment.
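Testing tool schemas with invalid arguments could, for example, take the shape below. This is a minimal sketch under assumptions: the schema format, `SEARCH_TOOL_SCHEMA`, and `validate_args` are invented for illustration and do not reflect the MCP specification or any SOFTSWISS tool.

```python
from typing import Any

# Hypothetical tool schema: argument name -> (expected type, required?)
SEARCH_TOOL_SCHEMA: dict[str, tuple[type, bool]] = {
    "query": (str, True),
    "limit": (int, False),
}


def validate_args(
    schema: dict[str, tuple[type, bool]], args: dict[str, Any]
) -> list[str]:
    """Return a list of human-readable problems with a tool call's arguments."""
    errors: list[str] = []
    for name, (expected, required) in schema.items():
        if name not in args:
            if required:
                errors.append(f"missing required argument: {name}")
            continue
        value = args[name]
        # bool is a subclass of int, so reject it explicitly for int fields.
        is_bool_for_int = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_for_int:
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    for name in args:
        if name not in schema:
            errors.append(f"unexpected argument: {name}")
    return errors


def test_invalid_tool_calls_are_rejected() -> None:
    # Valid call, then one edge case per failure mode.
    assert validate_args(SEARCH_TOOL_SCHEMA, {"query": "slots"}) == []
    assert validate_args(SEARCH_TOOL_SCHEMA, {"limit": 5}) == [
        "missing required argument: query"
    ]
    assert validate_args(SEARCH_TOOL_SCHEMA, {"query": "x", "limit": "5"}) == [
        "wrong type for limit: expected int"
    ]
    assert validate_args(SEARCH_TOOL_SCHEMA, {"query": "x", "mode": "fast"}) == [
        "unexpected argument: mode"
    ]
```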

Success in this position means actively contributing to CI/CD and QA process improvements, designing and maintaining evaluation suites for RAG and agentic flows, and setting up regression checks for changes in prompts and models. The role also tracks AI system quality alongside cost, latency, and token usage, utilizing tracing and observability tools to debug and optimize LLM application behavior. This requires a proactive approach to ensuring the reliability and performance of complex AI-driven products.

The salary for this full-time remote position ranges from $115,000 to $195,500 annually.

Skills that matter here

Python: This role requires strong Python skills for building and maintaining test automation frameworks and developing AI evaluation pipelines.
PyTest: Experience with PyTest is essential for automating test cases within the existing framework.
CI/CD: The role involves contributing to Continuous Integration/Continuous Deployment process improvements and integrating automated tests.
AI testing: A core focus of this position is on testing AI/LLM-based systems, including building evaluation pipelines and performing adversarial testing.
Test automation: The candidate will be responsible for building and maintaining reliable test automation to enhance product quality.

Who this role suits

A candidate with 5+ years in Quality Assurance, including significant automation testing experience.
Someone with a solid understanding of QA principles, test design, and the SDLC.
An individual who is interested in testing AI/LLM-based systems, with a willingness to learn quickly in this area.
A professional familiar with testing non-deterministic systems and comfortable with concepts like RAG and LLM evaluation metrics.

From the employer

Responsibilities

Analyze requirements and define the testing approach for new features and product changes
Automate test cases using the existing framework based on Python and PyTest
Build automated quality evaluation pipelines for AI systems using metrics and LLM-as-judge approaches
Test MCP servers, tool schemas and tool-call behavior, including edge cases and invalid arguments
Evaluate agentic workflows, including tool selection, multi-step reasoning, error handling, loop recovery and state correctness
Maintain and improve the test automation framework and contribute to internal testing tools, including mocks
Prepare and maintain test documentation, including checklists, test cases and quality reports
Participate in test design, estimations, release testing and product quality assessment
Contribute to CI/CD and QA process improvements
Design and maintain evaluation suites and golden datasets for RAG and agentic flows
Perform adversarial testing for AI systems, including prompt injection, jailbreaks, tool misuse and data leakage risks
Set up regression checks for changes in prompts, models, retrieval settings and chunking strategies
Track AI system quality together with cost, latency and token usage
Use tracing and observability tools to debug, measure and improve LLM application behavior

Requirements

5+ years of experience in Quality Assurance, including both manual and automation testing
Solid understanding of QA principles, test design, test coverage, test pyramid and SDLC
Experience with Python-based test automation frameworks, such as PyTest, Behave or similar
Experience with CI/CD and monitoring or alerting tools, such as Datadog, ELK, Sentry or similar
Interest in testing AI/LLM-based systems. Hands-on experience is preferred, but we are also open to candidates who can learn quickly and want to grow in this area
Familiarity with RAG, LLM evaluation and quality metrics, such as groundedness, faithfulness, answer relevance and retrieval quality
Experience or interest in AI evaluation tools, such as RAGAS, DeepEval, promptfoo, LangSmith Eval, TruLens, Arize Phoenix or similar
Understanding of how to test non-deterministic systems, where there may be no single correct output
Familiarity with LangChain, LangGraph, MCP, vector databases, semantic search or LLM observability tools would be a strong plus
Good spoken and written English (B2 level or higher)

Benefits

Private health insurance
Sports benefits
Comprehensive Mental Health Program
Free English lessons (online)
Local language courses
Paid time off
Maternity leave support
Referral program rewards
Upskilling, internal workshops, and participation in professional conferences and corporate events

Questions about this role

What is the seniority level for this position?

This is a senior-level position requiring 5+ years of experience in Quality Assurance.

What are the key technical skills required for this role?

Key technical skills include Python, PyTest, CI/CD, and experience or interest in AI testing and test automation frameworks.

How do I apply for this job?

The job posting links to the company's career page, which likely contains application instructions.
