Website Crawler / Scraper Manager

£20-250 GBP

Closed
Posted more than 7 years ago

Paid on delivery
I'm looking for someone to code a solid website scraper / crawler. I've already coded a version, but it is not as good as I need it to be, so I need help creating a new, better version from scratch.

In short, I need to be able to manage (create/edit/delete) scraping tasks through a robust, flexible and advanced UI. The scraping task script needs to look for things to do at regular intervals (ideally as an update daemon service on my Ubuntu VPS instead of a cron task), with the scraped data inserted into a MySQL database. The sites in question are generally news sites relating to games and tech; the key data is headline, intro and/or full content, date published, author and URL to the full story (similar to what an RSS feed could provide, but these sites do not have RSS feeds).

Beyond the use of PHP/jQuery and Ajax, I expect you to use something like SimpleHTMLdom (which I used; however, maybe you prefer another framework, so this can be discussed) and Datatables for all types of tables (alternatively some Bootstrap tables). Also note that I use a theme called Metronic – Admin Dashboard for my general UI design; I can provide a default template and link in that regard.

Features that will be required

Advanced create/edit/delete tasks UI, so that as far as possible everything needed to get a page scraped for data can be done via the UI.
Smart way to manage multiple page scrapes from the same website, e.g. when there is no way to fetch news, reviews and features from a single page.
List of tasks with relevant status; search, filter, sort and manage options.
Update daemon that can run as a background process on a VPS Ubuntu 14.04 box. This manages all the tasks based on task settings and interval criteria to fetch data.
Error handling: able to recover in case of failed fetches and interruptions, re-schedule tasks etc.; logging of what is going on and of errors that occurred.
Error management: a warning system that flags tasks that might have issues, e.g. we're no longer scraping a headline or an author because a site changed its code.

Happy to answer any further questions, just ask.

IMPORTANT

Timeline/deadlines: while I would have loved to have this done yesterday, do let me know an estimate of how much time you believe will be required to complete the project. A high level of English is also required. Offers that fail to provide this information will not be considered.

See attached images for a view of my current system.
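To make the interval and warning requirements above more concrete, here is a rough sketch of how a daemon could decide that a task is due and flag a task whose scraped items have lost a field. It is written in Python purely for illustration (the project itself is expected to be PHP), and every name in it (Task, EXPECTED_FIELDS, is_due, missing_fields, record_run) is a hypothetical placeholder, not part of my current system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional

# Key data named in the description; the identifiers are assumptions.
EXPECTED_FIELDS = ("headline", "intro", "published", "author", "url")


@dataclass
class Task:
    """A hypothetical scraping task with its schedule and warning state."""
    name: str
    interval: timedelta
    last_run: Optional[datetime] = None
    warnings: List[str] = field(default_factory=list)


def is_due(task: Task, now: datetime) -> bool:
    """A task is due if it has never run or its interval has elapsed."""
    return task.last_run is None or now - task.last_run >= task.interval


def missing_fields(item: Dict[str, str]) -> List[str]:
    """Return the expected fields that are absent or blank in one item."""
    return [f for f in EXPECTED_FIELDS if not (item.get(f) or "").strip()]


def record_run(task: Task, items: List[Dict[str, str]], now: datetime) -> None:
    """Store the run time and flag fields missing from every scraped item."""
    task.last_run = now
    task.warnings = []
    if not items:
        task.warnings.append("no items scraped")
        return
    for f in EXPECTED_FIELDS:
        if all(f in missing_fields(item) for item in items):
            task.warnings.append(f"field '{f}' missing from all items")
```

A field is only flagged when it is missing from every item in a run, on the assumption that a single story without an author is normal while a whole page without authors suggests the site changed its markup. The threshold is a design choice that could be made configurable per task.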
Project ID: 11323928

About the project

6 proposals
Remote project
Active 7 years ago

6 freelancers are bidding an average of £176 GBP for this job
You will receive EXCELLENT results from my work. My reviews speak of my excellent attention to detail and my great customer service! Please review my profile and read my client reviews (101 reviews - 5 stars). I would be grateful to have the opportunity to chat with you and discuss your project in detail. I look forward to hearing from you. Sparximer
£142 GBP in 3 days
5.0 (12 reviews)
5.4

About the client

Brighton, United Kingdom
5.0
5
Payment method verified
Member since Nov 7, 2012

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)