Write some Software

$30-250 USD

Closed
Posted over 8 years ago

Paid on delivery
We are seeking a computer programmer to build a custom web scraper for the World News Connection (WNC) database, which is hosted by East View. The scraper would take a custom list of inputs, and the output would be the number of hits (the number of unique articles that mention the unit of analysis) for each input over a specified time interval. The output would need to include results for each time unit (day, week, month, year).

We would prefer the programmer to use a publicly available, open source scraper, such as Scrapy, so that the scraper/crawler could be adjusted if our needs or the World News Connection database changed. We would provide access to the backend code of World News Connection.

The scraper would require the following features:

1) The ability to upload two separate lists of search terms to the scraper. The first list contains the units of analysis (UOA) and the second contains keywords. For each UOA, a search must be conducted with each keyword. Items from both lists should be enclosed in quotes (""). The example below shows two lists, one of units of analysis and one of keywords.

UOA      Keywords
Obama    Iran
Putin    Syria
         China

Using these units of analysis, each item should be searched as follows:

“Obama” AND “Iran”
“Obama” AND “Syria”
“Obama” AND “China”
“Putin” AND “Iran”
“Putin” AND “Syria”
“Putin” AND “China”

A user can search World News Connection using only units of analysis but CANNOT search using only keywords.

2) The user should be able to select the start and end dates in days, weeks (Monday to Sunday), months or years, from January 1st, 1995 to December 30th, 2013 (these are the dates for the entire database). The source date, NOT the insertion date, should be searched. A user could input these time intervals as a third list if needed; however, we would prefer that the user select only three items and that the scraper generate the time units itself.

For example, if a user wanted to output a search for “British Petroleum” for each day from June 25th, 2012 to July 7th, 2012, the user should only need to select three things:

1. Start Date: June 25th, 2012
2. End Date: July 7th, 2012
3. Interval: Day (could be week, month, year)

The output should then be shown as below:

"Year"  "Month"  "Day"
2012    6        25
2012    6        26
2012    6        27

3) A GUI for the scraper, so that we would be able to run the scraper at the time of our choosing and without reliance on the programmer.

4) A sustainable program that could be executed at any time by users.

5) The scraper must be able to search for treaty names in the order the words appear: “Charter of the United Nations” vs. “Charter” … “United Nations”.

6) The output should be shown as follows:

"Unit of analysis"  "Keyword"  "Count"  "Year"  "Month"  "Day"
Ukraine             protest    0        2014    4        1
Ukraine             protest    0        2014    4        2
Ukraine             protest    3        2014    4        3
Ukraine             protest    0        2014    4        4
Ukraine             protest    0        2014    4        5
Ukraine             crimea     8        2014    4        1
Ukraine             crimea     7        2014    4        2
Ukraine             crimea     7        2014    4        3
Ukraine             crimea     5        2014    4        4
Ukraine             crimea     9        2014    4        5
Thailand            protest    1        2014    4        1
Thailand            protest    4        2014    4        2
Thailand            protest    5        2014    4        3
Thailand            protest    0        2014    4        4

7) The scraper must be able to access WNC at varying intervals, determined by the user, so that we do not disturb the World News Connection server. The user can set how long to wait between search terms, in seconds or minutes.

The developer should have proven experience in Python and web scraping. Data science expertise or previous experience with an open source scraping program is a plus. The source code of the search and results pages and the associated links will be provided. If you have any questions, please feel free to reach out.
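The search grid described in requirements 1, 2, 6 and 7 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the delivered scraper: the function names (`build_queries`, `split_interval`, `run`, `to_csv`) are hypothetical, the `count_hits` callable stands in for whatever request and parsing logic the real WNC search page would need, and the quoted-phrase `AND` syntax is taken from the examples in the posting rather than from WNC documentation.

```python
import csv
import io
import time
from datetime import date, timedelta
from itertools import product


def build_queries(uoas, keywords):
    """Pair every unit of analysis (UOA) with every keyword as quoted phrases."""
    if not uoas:
        # The posting allows UOA-only searches but never keyword-only ones.
        raise ValueError("at least one unit of analysis is required")
    if not keywords:
        return [(uoa, "", f'"{uoa}"') for uoa in uoas]
    return [(u, k, f'"{u}" AND "{k}"') for u, k in product(uoas, keywords)]


def split_interval(start, end, unit):
    """Yield (first_day, last_day) windows covering start..end inclusive."""
    current = start
    while current <= end:
        if unit == "day":
            nxt = current + timedelta(days=1)
        elif unit == "week":
            # Weeks run Monday to Sunday; weekday() is 0 for Monday.
            nxt = current + timedelta(days=7 - current.weekday())
        elif unit == "month":
            nxt = date(current.year + current.month // 12, current.month % 12 + 1, 1)
        elif unit == "year":
            nxt = date(current.year + 1, 1, 1)
        else:
            raise ValueError(f"unknown interval: {unit}")
        # Clip the last window so it never runs past the requested end date.
        yield current, min(nxt - timedelta(days=1), end)
        current = nxt


def run(uoas, keywords, start, end, unit, count_hits, delay_seconds=0.0):
    """Collect one output row per (UOA, keyword, time window).

    count_hits(query, first_day, last_day) is a placeholder (assumption) for
    the code that would query WNC by source date and return the hit count.
    """
    rows = []
    for uoa, keyword, query in build_queries(uoas, keywords):
        for first, last in split_interval(start, end, unit):
            hits = count_hits(query, first, last)
            rows.append((uoa, keyword, hits, first.year, first.month, first.day))
            time.sleep(delay_seconds)  # user-chosen pause between searches
    return rows


def to_csv(rows):
    """Render rows in the column order requested in the posting."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Unit of analysis", "Keyword", "Count", "Year", "Month", "Day"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Dummy counter that always reports zero hits, for demonstration only.
    demo = run(["Ukraine"], ["protest", "crimea"],
               date(2014, 4, 1), date(2014, 4, 5), "day",
               count_hits=lambda query, first, last: 0)
    print(to_csv(demo))
```

Keeping the hit counter as a plain callable means the grid logic can be checked without touching the WNC server, and the same loop could later be driven from a GUI or from a Scrapy spider.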
Project ID: 8429143

About the project

9 proposals
Remote project
Active 9 years ago

Awarded to:
Thanks for the information, but I need to see the site and its HTML structure. Regards
$250 USD in 5 days
4.9 (366 reviews)
7.0

9 freelancers are bidding an average of $290 USD for this job

A proposal has not yet been provided
$263 USD in 5 days
4.9 (151 reviews)
7.2

Hi. I'm not sure why the budget is so small; this project requires more time and funds. Anyway, I read the project and got interested. I would like to know more details. Here are the questions so far: 1. On which OS will you run all this? 2. How do we access this "World News Connection database"? Can you give me an example? 3. I'm confused about "A user can search World News Connections using only Units of Analysis but CANNOT search using only keywords." You give those two lists, keywords and UOA, as input. Please elaborate on this. 4. Do you need a web-based UI or a desktop one? I see this as a web-based service where each user can log in, have their own account, and create the search patterns for which they may start a spider. When the spider is finished, they can see the results in the web app or download them as CSV. Yes, I'm planning to use Scrapy and run the spiders via scrapyd. Let me know the details and we can continue. Thx, George.
$1,111 USD in 15 days
4.9 (91 reviews)
7.1

Hello Dear, I can do this for you. Please send a message in the PMB for details. Best Regards, flashsaiful
$147 USD in 3 days
4.8 (129 reviews)
6.5

Please read! I have read over your project for scraping the WNC database and performing analysis on that data, and I am certain that I can provide results that you will be happy with. I have created very similar scripts in the past (please contact me so that I may share some examples). I am a professional programmer with several years of scripting experience using Python. Please contact me, and I will begin right away. Thank you for your time, and I hope to work with you soon.
$166 USD in 3 days
4.9 (68 reviews)
6.5

Hi, please provide the source code of the search and results pages and the associated links. I'll use Perl to develop the custom scraper for this project.
$200 USD in 3 days
5.0 (47 reviews)
6.2

Scrapy is more of a crawler than a scraper. I have used it to scrape some online newspaper sites in the past (for someone doing some work on language processing), so this is a very similar project for me. Just contact me to discuss further. Thanks.
$222 USD in 3 days
4.8 (8 reviews)
3.5

About this client

United States
5.0
55
Payment method verified
Member since Jan 18, 2009

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)