<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Share Good Things &#187; Scraping</title>
	<atom:link href="http://www.sharegoodthings.net/tag/scraping/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.sharegoodthings.net</link>
	<description></description>
	<lastBuildDate>Wed, 04 May 2022 11:22:42 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.6</generator>
		<item>
		<title>Does the 9th Circuit’s decision in HiQ vs. LinkedIn open the floodgates to scraping?</title>
		<link>http://www.sharegoodthings.net/does-the-9th-circuits-decision-in-hiq-vs-linkedin-open-the-floodgates-to-scraping/</link>
		<comments>http://www.sharegoodthings.net/does-the-9th-circuits-decision-in-hiq-vs-linkedin-open-the-floodgates-to-scraping/#comments</comments>
		<pubDate>Mon, 16 Sep 2019 10:22:45 +0000</pubDate>
		<dc:creator>gftshappy</dc:creator>
				<category><![CDATA[SEO]]></category>
		<category><![CDATA[Circuit’s]]></category>
		<category><![CDATA[Decision]]></category>
		<category><![CDATA[floodgates]]></category>
		<category><![CDATA[LinkedIn]]></category>
		<category><![CDATA[Open]]></category>
		<category><![CDATA[Scraping]]></category>

		<guid isPermaLink="false">http://www.sharegoodthings.net/does-the-9th-circuits-decision-in-hiq-vs-linkedin-open-the-floodgates-to-scraping/</guid>
		<description><![CDATA[The case may still be reheard or appealed but it appears to be a broad ruling in favor of &#8220;the open internet.&#8221; Please visit Search Engine Land for the full article. Search Engine Land: News &#038; Info About SEO, PPC, &#8230; <a href="http://www.sharegoodthings.net/does-the-9th-circuits-decision-in-hiq-vs-linkedin-open-the-floodgates-to-scraping/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>The case may still be reheard or appealed but it appears to be a broad ruling in favor of &#8220;the open internet.&#8221; <br/> <br/> Please visit Search Engine Land for the full article.
<p>
<a target="_blank" rel="nofollow" href="http://feeds.searchengineland.com/~r/searchengineland/~3/NaUq9y_cFok/does-the-9th-circuits-new-decision-in-hiq-vs-linkedin-open-the-floodgates-to-scraping-321687">Search Engine Land: News &#038; Info About SEO, PPC, SEM, Search Engines &#038; Search Marketing</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.sharegoodthings.net/does-the-9th-circuits-decision-in-hiq-vs-linkedin-open-the-floodgates-to-scraping/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data scraping tools for marketers who don’t know code</title>
		<link>http://www.sharegoodthings.net/data-scraping-tools-for-marketers-who-dont-know-code/</link>
		<comments>http://www.sharegoodthings.net/data-scraping-tools-for-marketers-who-dont-know-code/#comments</comments>
		<pubDate>Sat, 20 Jul 2019 16:22:20 +0000</pubDate>
		<dc:creator>gftshappy</dc:creator>
				<category><![CDATA[SEO]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[Data]]></category>
		<category><![CDATA[don't]]></category>
		<category><![CDATA[Know]]></category>
		<category><![CDATA[Marketers]]></category>
		<category><![CDATA[Scraping]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://www.sharegoodthings.net/data-scraping-tools-for-marketers-who-dont-know-code/</guid>
		<description><![CDATA[Here are some free software options to extract data from small to medium data sets to help you get the job done. Please visit Search Engine Land for the full article. Search Engine Land: News &#038; Info About SEO, PPC, &#8230; <a href="http://www.sharegoodthings.net/data-scraping-tools-for-marketers-who-dont-know-code/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Here are some free software options to extract data from small to medium data sets to help you get the job done.<br/> <br/> Please visit Search Engine Land for the full article.
<p>
<a target="_blank" rel="nofollow" href="http://feeds.searchengineland.com/~r/searchengineland/~3/Ajj34ImnPls/data-scraping-tools-for-marketers-who-dont-know-code-319446">Search Engine Land: News &#038; Info About SEO, PPC, SEM, Search Engines &#038; Search Marketing</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.sharegoodthings.net/data-scraping-tools-for-marketers-who-dont-know-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scraping Schema Markup for Competitive Intelligence</title>
		<link>http://www.sharegoodthings.net/scraping-schema-markup-for-competitive-intelligence/</link>
		<comments>http://www.sharegoodthings.net/scraping-schema-markup-for-competitive-intelligence/#comments</comments>
		<pubDate>Mon, 16 Sep 2013 17:19:52 +0000</pubDate>
		<dc:creator>gftshappy</dc:creator>
				<category><![CDATA[SEO]]></category>
		<category><![CDATA[Competitive]]></category>
		<category><![CDATA[Intelligence]]></category>
		<category><![CDATA[Markup]]></category>
		<category><![CDATA[Schema]]></category>
		<category><![CDATA[Scraping]]></category>

		<guid isPermaLink="false">http://www.sharegoodthings.net/scraping-schema-markup-for-competitive-intelligence/</guid>
		<description><![CDATA[Structured mark up is crucial for e-commerce websites if they want to stand out in the SERPs. Because e-commerce sites are generally set up to scale, scraping all of their information is very easy. All it takes is a Screaming &#8230; <a href="http://www.sharegoodthings.net/scraping-schema-markup-for-competitive-intelligence/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Structured markup is crucial for e-commerce websites that want to stand out in the SERPs. Because e-commerce sites are generally set up to scale, scraping all of their information is very easy. All it takes is a Screaming Frog crawl and OutWit Hub.</p>
<p>For dropshippers and affiliate sites, harvesting competitor data from schema markup tags can be extremely useful. If you are selling the same products as your competitors, you can compare pricing, product descriptions, calls to action and special promotions, or anything else, and analyze how you stack up.</p>
<p>Before we can start, we need to figure out where products live on the competitor site. If your competitor has a clearly built-out information architecture, it shouldn’t be too tough. Target.com, for example, uses the /p/ directory for its products.</p>
<h3 style="text-align: center;"><img class="aligncenter  wp-image-14513" alt="target_IA_example" src="http://www.sharegoodthings.net/wp-content/uploads/2013/09/274d9__target_IA_example.png" width="527" height="402" /></h3>
<h2>Step 1) Crawl and Collect Product Pages</h2>
<p>To get the pages that live under the /p/ directory, fire up <a href="http://www.screamingfrog.co.uk/seo-spider/" target="_blank">Screaming Frog</a> and, under Configuration &gt; Include, add .*/p/.*</p>
<p><img class="aligncenter size-full wp-image-14514" alt="include p directory to snag products" src="http://www.sharegoodthings.net/wp-content/uploads/2013/09/274d9__include_for_products_screaming_frog.png" width="414" height="347" /></p>
<p style="text-align: center;"><strong>Now your Screaming Frog export will only include product pages</strong></p>
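<p>To see what that include pattern does outside of Screaming Frog, here is a minimal Python sketch. It assumes the pattern has to match the whole URL, which may not be exactly how Screaming Frog applies it, and the sample URLs are made up for illustration.</p>

```python
import re

# The include pattern from the Screaming Frog configuration above.
INCLUDE_PATTERN = re.compile(r".*/p/.*")

def filter_product_urls(urls):
    """Keep only URLs that match the include pattern in full."""
    return [url for url in urls if INCLUDE_PATTERN.fullmatch(url)]

# Hypothetical URLs, for illustration only.
sample_urls = [
    "http://www.target.com/p/example-laptop/-/A-00000001",
    "http://www.target.com/c/laptops/-/N-0000",
    "http://www.target.com/p/another-laptop/-/A-00000002",
]

product_urls = filter_product_urls(sample_urls)
print(product_urls)
```

<p>Only the two URLs that contain /p/ survive the filter, which is the same effect the include rule has on the crawl export.</p>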
<p style="text-align: left;">So everyone can follow along and work with the same data, I’ve gone ahead and scraped all the laptops that are currently listed on the Target.com site, which you can get here:</p>
<p style="text-align: center;"><a href="https://docs.google.com/spreadsheet/pub?key=0AqOQkR7YOK3XdFUyN1lmdFBQZW5kejI2bi1hQkVFTVE&amp;single=true&amp;gid=0&amp;output=html" target="_blank"> List of Target Laptops (09/10/2013)</a></p>
<h2 style="text-align: left;">Step 2) Analyze Structured Markup and On Page Elements</h2>
<p>Take one of the product pages from your Screaming Frog export. For this example, we’ll use the <a href="http://www.target.com/p/acer-aspire-11-6-touch-screen-laptop-pc-s7-191-6640-us-with-128gb-ssd-4gb-memory-silver/-/A-14292747#prodSlot=large_1_30" target="_blank">Acer Aspire 11.6″ Touch Screen Laptop PC</a> page. If you <a href="http://www.google.com/webmasters/tools/richsnippets?q=http%3A%2F%2Fwww.target.com%2Fp%2Facer-aspire-11-6-touch-screen-laptop-pc-s7-191-6640-us-with-128gb-ssd-4gb-memory-silver%2F-%2FA-14292747%23prodSlot%3Dlarge_1_30" target="_blank">enter the URL into the Rich Snippet Testing Tool</a>, you can see that Target uses a lot of structured markup on its product pages.</p>
<p><strong>For this exercise, we’re going to scrape:</strong></p>
<ul>
<li><strong>Price</strong></li>
<li><strong>SKU</strong></li>
<li><strong>Product Name</strong></li>
<li><strong>Battery Charge Life (non-schema element)</strong></li>
<li><strong>Call to action/Promotion (non-schema element)</strong></li>
</ul>
<h2>Step 3) Fire up OutWit Hub</h2>
<p><img class="aligncenter size-full wp-image-14536" alt="Outwit Hub Logo" src="http://www.sharegoodthings.net/wp-content/uploads/2013/09/36ea7__Outwit-Hub-Logo.gif" width="155" height="163" /></p>
<p><a href="http://www.outwit.com/" target="_blank">OutWit Hub</a> is a desktop scraper/data harvester. It costs $60 a year and is well worth it. OutWit can use cookies, so scraping behind a paywall or on a password-protected site is a non-issue. Instead of having to use XPath to scrape data, OutWit Hub lets you highlight the source code and set markers to scrape everything that lies in between. If you are not a technical marketer and you find yourself spending a lot of time collecting data by hand, this is a good tool to have in your arsenal.</p>
<h2>Step 4) Build Your Scraper</h2>
<p><em>This may be intimidating at first, but it’s so much more scalable than trying to use Excel or Google Docs to scrape thousands of data points.</em></p>
<p>In the right-hand menu, click on Scrapers. Enter the example Target URL. This will load the source code.</p>
<p>Click on the “New” Button on the lower portion of the screen and name your scraper. I’m calling mine, “<strong>Target Laptop Scraper</strong>.”</p>
<p style="text-align: center;"><img class=" wp-image-14516  aligncenter" alt="Outwit_Scraper_Build" src="http://www.sharegoodthings.net/wp-content/uploads/2013/09/36ea7__Outwit_Scraper_Build.png" width="542" height="349" /></p>
<p style="text-align: left;">In the search box, start entering the markup for the schema tags you want to scrape. Remember, this isn’t XPath, so you don’t need to worry about the DOM. You only need to figure out what unique source code comes before the element (the schema tag) and what comes after it.</p>
<h2 style="text-align: center;">Extreme Close Up!</h2>
<p style="text-align: center;"><a href="http://www.sharegoodthings.net/wp-content/uploads/2013/09/36ea7__Scraper_Build_Close_Up.png" target="_blank"><img class="aligncenter  wp-image-14519" alt="Scraper_Build_Close_Up" src="http://www.sharegoodthings.net/wp-content/uploads/2013/09/36ea7__Scraper_Build_Close_Up.png" width="685" height="177" /></a></p>
<p>It will take some practice at first, but once you get the hang of it, it will only take a few minutes to set up a custom scraper.</p>
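<p>The marker idea can be sketched in a few lines of Python. This is not how OutWit Hub is implemented, just an illustration of "find the text before, find the text after, keep what is in between." The page source, the markers and the function name are all hypothetical and are not copied from Target.com.</p>

```python
def extract_between(source, before, after):
    """Return the text between the first `before` marker and the next `after` marker, or None."""
    start = source.find(before)
    if start == -1:
        return None
    start += len(before)
    end = source.find(after, start)
    if end == -1:
        return None
    return source[start:end].strip()

# Hypothetical page source, for illustration only.
sample_source = (
    '<h2 itemprop="name">Example Laptop 11.6"</h2>'
    '<span itemprop="price">$999.99</span>'
    '<span itemprop="sku">A-00000001</span>'
)

# Each field is defined by the marker before it and the marker after it.
markers = {
    "Product Name": ('itemprop="name">', "</h2>"),
    "Price": ('itemprop="price">', "</span>"),
    "SKU": ('itemprop="sku">', "</span>"),
}

row = {
    field: extract_between(sample_source, before, after)
    for field, (before, after) in markers.items()
}
print(row)
```

<p>The same approach works for non-schema elements such as a promotion banner, as long as the source code on either side of the value is unique on the page.</p>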
<h2>Step 5) Test Your Scraper</h2>
<p>Once you’re done entering the markers for the data you want to collect, hit the Execute button and check your results. You should see something like this:</p>
<p style="text-align: center;"><img class="aligncenter  wp-image-14525" alt="scraper_test_for_outwit_hub" src="http://www.sharegoodthings.net/wp-content/uploads/2013/09/3872c__scraper_test_for_outwit_hub.png" width="714" height="102" /></p>
<h2>Step 6) Put the list of URLs into a .txt file and save it</h2>
<div class="wp-caption aligncenter" id="attachment_14543" style="width: 510px;"><img class="size-full wp-image-14543" alt="disks for saving" src="http://www.sharegoodthings.net/wp-content/uploads/2013/09/3872c__disks-for-saving.jpg" width="500" height="333" />
<p class="wp-caption-text">Any of these storage devices or your local machine will do</p>
</div>
<h2>Step 7) Open the .txt file in OutWit using the File menu</h2>
<p>In the left navigation, just under the main directory, there is a subdirectory called “Links.” Click on it. This is what you should see:</p>
<p style="text-align: center;"><a href="http://www.sharegoodthings.net/wp-content/uploads/2013/09/9d608__tons_of_links_from_outwit.png" target="_blank"><img class="aligncenter  wp-image-14528" alt="a list of links from outwit to scrape" src="http://www.sharegoodthings.net/wp-content/uploads/2013/09/9d608__tons_of_links_from_outwit.png" width="523" height="351" /></a></p>
<p style="text-align: left;">Select all the data using Control+A and then right-click on the row with all the URLs.</p>
<h2>Step 8) Fast Scrape!</h2>
<p style="text-align: center;"><a href="http://www.sharegoodthings.net/wp-content/uploads/2013/09/9d608__Scraping_tons_of_links_with_outwit.png" target="_blank"><img class="aligncenter  wp-image-14527" alt="scraping tons of schema with outwit" src="http://www.sharegoodthings.net/wp-content/uploads/2013/09/9d608__Scraping_tons_of_links_with_outwit.png" width="546" height="485" /></a></p>
<p>In the right-click menu, select Auto-Explore &gt; Fast Scrape (Include Selected Data), then select the scraper we just built together.</p>
<p style="text-align: center;"><a href="http://screencast.com/t/ntOZZzlYd38a" target="_blank">Here’s a video of the last step in OutWit</a></p>
<h2>Step 9) Bask in the glory of your competitor’s data</h2>
<p style="text-align: center;"> <img class="wp-image-14539" alt="scraped pricing data from target using outwit" src="http://www.sharegoodthings.net/wp-content/uploads/2013/09/265c2__end_product_-_scraped_pricing_data_from_target.png" width="637" height="383" /></p>
<p style="text-align: left;">In the left-hand navigation, there is a category called “data” with the subcategory “scraped.” If you navigate away from the results, that’s where all your data is stored. Be careful not to load a new URL in OutWit Hub, or the data will be overwritten and you will have to scrape all over again.</p>
<p style="text-align: left;">You can export your data to HTML, TXT, CSV, SQL or Excel. I generally go for an Excel export and do a <a href="http://seogadget.com/using-vlookup/" target="_blank">VLOOKUP</a> to combine the data with the original Screaming Frog crawl from Step 1.</p>
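<p>If you would rather combine the two exports outside of Excel, the VLOOKUP can be approximated with a dictionary lookup in Python. This is a rough sketch: it assumes both exports share a URL column, and the column names and rows below are invented for illustration.</p>

```python
# Hypothetical rows standing in for the Screaming Frog export and the OutWit export.
crawl_rows = [
    {"url": "http://www.target.com/p/example-laptop/-/A-00000001", "title": "Example Laptop"},
    {"url": "http://www.target.com/p/another-laptop/-/A-00000002", "title": "Another Laptop"},
]
scraped_rows = [
    {"url": "http://www.target.com/p/example-laptop/-/A-00000001", "price": "$999.99"},
]

def join_on_url(crawl, scraped):
    """Attach scraped fields to each crawl row with the same URL, like a VLOOKUP on the URL column."""
    lookup = {row["url"]: row for row in scraped}
    combined = []
    for row in crawl:
        match = lookup.get(row["url"], {})
        # Crawl fields win on a key clash; rows with no match keep only crawl fields.
        combined.append({**match, **row})
    return combined

combined_rows = join_on_url(crawl_rows, scraped_rows)
for row in combined_rows:
    print(row)
```

<p>Crawl rows without a scraped match are kept as they are, so it is easy to spot pages where the scraper came back empty.</p>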
<h1 style="text-align: center;">Got any fun potential use cases?</h1>
<h1 style="text-align: center;">Share them below in the comments!</h1>
<pre style="text-align: left;">Image source via <a href="http://www.flickr.com/photos/avaragado/" target="_blank">Flickr user avaragado</a></pre>
<p>The post <a href="http://seogadget.com/scraping-schema/" target="_blank">Scraping Schema Markup for Competitive Intelligence</a> appeared first on <a href="http://seogadget.com" target="_blank">SEOgadget</a>.</p>
<p>
<a href="http://feedproxy.google.com/~r/seogadget/~3/Isqz0h9VKjE/" target="_blank" rel="nofollow">SEOgadget</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.sharegoodthings.net/scraping-schema-markup-for-competitive-intelligence/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
