morellowed

Citations without links (automation help needed)


Hi,

 

This is my first post here, so hello!

 

I hope to be able to learn a lot and to contribute too.

 

I have recently read a little about scrapers and I am wondering if the following is possible:

 

I would like to paste in a list of URLs and have the scraper go to each page and look in the source code for a domain that I ask it to look for. I would then like to have the results in a spreadsheet.

 

The list of URLs would be pages where a certain business is mentioned and where there may or may not be a link.

 

I can then follow up mentions of a business that may not have a link to see if I can get one.

 

Does anyone know how this could be done using any software?

 

Is it even a scraper that I need?

 

Any help appreciated.

 

M


So all you want back is a list of the URLs from your initial list that contain the text string (in this case a business name) on the page?

 

If so, that should be pretty easy to do. If you have ScrapeBox, you should be able to use it. If not, any scraper should be able to do that, as you are not asking for much info.
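For anyone curious what that text-string check looks like in code, here is a minimal Python sketch using only the standard library. The names `TextExtractor` and `page_mentions` are made up for illustration, and it assumes you already have the page's HTML as a string. It is a rough approach, not how ScrapeBox or any particular tool does it.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects page text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def page_mentions(html, phrase):
    """Return True if `phrase` appears in the page text, ignoring case."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so line breaks do not hide a match.
    text = " ".join(" ".join(parser.parts).split())
    return phrase.casefold() in text.casefold()


# "Acme Widgets" is a placeholder business name.
print(page_mentions("<p>We love Acme   Widgets!</p>", "acme widgets"))
```

Run over every page in the list, this gives the yes/no answer for "is the business mentioned here".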

 

Also, if this is a one-time thing or a test, instead of investing in the software I bet you can find a Fiverr gig that will do it for you.

I have a list of URLs which mention a business. I now want to check if each URL has a link to the business.

If the URL does not link to the business, I want to contact them to get a link.

I want to do this on an ongoing basis.

How do I get started with a scraper?

M


OK, I had it backwards, but you are still taking a list of URLs and splitting it into two lists using just one yes/no criterion, which is a pretty simple task.
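To show how small that splitting step is, here is a rough Python sketch. `split_by_criterion` and `has_link` are hypothetical names, and the `known` dictionary just stands in for whatever check you end up using (a scraper result, a script, a manual review).

```python
def split_by_criterion(urls, has_link):
    """Split `urls` into (linked, unlinked) using one yes/no test."""
    linked, unlinked = [], []
    for url in urls:
        (linked if has_link(url) else unlinked).append(url)
    return linked, unlinked


# Placeholder results standing in for a real link check.
known = {
    "https://blog.test/review": True,
    "https://news.test/story": False,
}

linked, unlinked = split_by_criterion(known, lambda url: known[url])
print("Already linking:", linked)
print("Worth contacting:", unlinked)
```

The second list is the one to follow up on for outreach.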

 

You should have no problem doing this with ScrapeBox if you have it. If not, it may still be the cheapest option. It has been a while, but I am pretty sure I have used it for similar work.

 

I did a quick search for dedicated scraping tools and they all looked fairly expensive and very much overkill for what you want to do. If you dig deeper you may find one more reasonably priced.

 

If you know anyone with some coding skills, they might be able to create a dedicated bot for that. It is a fairly simple task and should not take more than a few hours, tops.
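As a rough idea of what such a bot boils down to, here is a Python sketch using only the standard library. All the names (`LinkCollector`, `links_to_domain`, `fetch`) and the example domains are made up for illustration. It assumes the link is an ordinary `<a href>` in the HTML the server returns, so links added by JavaScript would be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_domain(html, page_url, domain):
    """Return True if any link on the page points at `domain` or a subdomain."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    domain = domain.lower()
    for href in parser.hrefs:
        try:
            # Resolve relative links against the page's own URL first.
            host = urlparse(urljoin(page_url, href)).hostname or ""
        except ValueError:
            continue  # skip malformed hrefs
        host = host.lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


def fetch(url, timeout=10):
    """Download a page and return its HTML as text (network access needed)."""
    request = Request(url, headers={"User-Agent": "link-checker-sketch"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Offline example with placeholder domains.
sample = '<p>Thanks to <a href="https://www.example.com/">Example Ltd</a></p>'
print(links_to_domain(sample, "https://blog.test/post", "example.com"))  # True
```

For a live page you would call `links_to_domain(fetch(url), url, "example.com")` for each URL in the list. Matching on the link's hostname rather than searching the raw source avoids counting a plain-text mention of the domain as a link.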


Thanks. I was hoping there would be a free tool for this task, such as a scraper add-on for Chrome.

 

There may well be. It is just going to be a matter of doing searches, looking at all the choices, and finding one that works for you.

 

A quick search shows there are some free choices out there, including another Chrome extension (actually not something I had even thought of).


Thanks for the links - I'll check these out.

 

I managed to find a tool that looks suitable. There is a Chrome extension, so you can manage things in the browser, and there is also a bit of software to download. Their support team has said that what I want is possible and has offered to do a demo.


There are free scripts out there for this if you dig. I've seen some in various languages on GitHub, for example. Try searching for things like: check if URL exists.

 

I actually have one I used to use for the same purpose. Happy to send you the script if you want to shoot me a PM with your email. You'll need to host it locally or put it up on hosting. The core functionality you are after is pretty basic; the more involved part is the details: how it is hosted, whether you upload a file of URLs each time and download a return file or want it web hosted with form input and output, and scaling it to work faster and handle more. I could recommend some coders too if needed.
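To give a feel for the "file of URLs in, return file out" variant, here is a rough Python sketch. This is not the script mentioned above, just an illustration. `read_urls`, `write_report`, and the column names are made up, and `check` stands for whatever yes/no test you plug in. The resulting CSV opens directly in a spreadsheet.

```python
import csv
import io


def read_urls(lines):
    """Yield non-empty, non-comment lines with whitespace stripped."""
    for line in lines:
        url = line.strip()
        if url and not url.startswith("#"):
            yield url


def write_report(urls, check, out):
    """Write one CSV row per URL: url, has_link (yes/no), error."""
    writer = csv.writer(out)
    writer.writerow(["url", "has_link", "error"])
    for url in urls:
        try:
            writer.writerow([url, "yes" if check(url) else "no", ""])
        except Exception as exc:
            # One bad page (timeout, 404, ...) should not stop the whole run.
            writer.writerow([url, "", str(exc)])


# In-memory demo; the lambda is a stand-in for a real link check.
buffer = io.StringIO()
write_report(read_urls(["https://a.test/page\n", "\n"]), lambda url: True, buffer)
print(buffer.getvalue())
```

With real files you would pass `open("urls.txt")` to `read_urls` and `open("report.csv", "w", newline="")` as `out`. Both file names are placeholders.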

 

Out of curiosity, how are you getting your list of URLs with mentions in the first place? Scraping Google or something? If you have something already set up for that, modifying or adding to it with additional steps might be your best option for saving time on the entire task.
