Web Crawling and Scraping Engineer

TinEye
Toronto, ON
Posted yesterday
Job details:
Remote
Full-time
Experienced
Benefits:
Health insurance

This is an in-office position in our downtown Toronto office.

About TinEye

TinEye (https://tineye.com) is an image search company. We're experts in computer vision, pattern recognition, neural networks, and machine learning. Our mission is to make images searchable.

We have built a small, tight-knit and exceptional team based in Toronto. We deliver image search and recognition solutions to industries where searching images is mission-critical. Today, millions of people use TinEye. Our image search technologies power billions of searches across a wide range of industries. We are privately owned, profitable and founder-led, and we are looking for a Web Crawling and Scraping Engineer to join our team.

This role is for a hands-on engineer who thrives on solving complex challenges. If you're driven by building robust systems and high-throughput data pipelines, this is your chance to shine.

We value ingenuity, hard work and problem-solving over pedigree. We expect every team member to solve challenging technical problems. We also believe that experience comes in many shapes and forms, so if you have the qualities to make you an excellent addition to our team, please apply to start a conversation. If you have any questions, reach out to us at [email protected].

As part of our team, you will be:

  • Writing and optimizing web crawlers, developing strategies to overcome site-specific challenges such as bot detection, CAPTCHAs, dynamic content loading, and evolving JavaScript frameworks.

  • Architecting ever-changing crawling strategies to rapidly prioritize and index web pages based on client requests.

  • Maintaining our high-throughput crawling and ingestion pipeline, including frontier scheduling, rate limiting, deduplication, parsing, and proxies.

  • Improving monitoring and logging systems to track crawler performance, crawl health, website coverage, uptime, crawl success rate, and latency.

  • Pinpointing and eliminating bottlenecks in crawling and extraction, ensuring efficient resource usage and scalability.

  • Maintaining clear documentation of debugging procedures.

  • Working closely with other engineering teams to ensure crawled data flows into indexing and matching systems.

We would love you to have the following:

  • 4+ years of engineering experience designing and operating large-scale web crawling and indexing infrastructure.

  • Extensive experience with distributed systems, crawler frontier design and real-time URL prioritization.

  • Expertise in crawling frameworks, scraping libraries, and HTTP protocols.

  • Strong programming skills in Python, Go, Rust, or C++ with experience in asynchronous/network programming.

  • Hands-on experience with Puppeteer and Selenium.

  • Solid understanding of web technologies, web protocols (HTTP/HTTPS), JavaScript rendering, anti-scraping countermeasures, and headless browsers.

  • Experience with asynchronous I/O, queues, and storage systems (e.g. PostgreSQL).

  • Solid knowledge of SQL and NoSQL databases.

  • Familiarity with observability tooling (Prometheus, Grafana, alerts, logging).

  • Excellent problem-solving, analytical, and communication skills, with a proactive attitude towards system improvement.

  • Grit, smarts and persistence.

What you can expect working at TinEye:

  • You will be solving interesting and complex technical challenges.

  • You will be able to see your code deployed in production environments and powering some of our largest enterprise client implementations.

  • You will work within a small development team alongside the scrappy co-founder.

  • Your voice will be heard, and you will have the opportunity to provide input on all aspects of our work. We believe great ideas can come from our entire team.

  • Did we mention a 4-day work week (eligible after your first year of employment), health benefits, a home office expense budget, a fully stocked kitchen, a downtown location, free parking and, last but not least, our maker space, should you, like us, be interested in tinkering with hardware?

How to apply

For immediate consideration, please submit your resume and a cover letter highlighting your relevant experience and links to past or personal projects to [email protected]. We want to get to know you, so don't be afraid to tell us about yourself, your interests and career goals. We are looking forward to meeting you.

This is not a remote position.
