SEO Crawl Analysis with Python: Analyzing Analytics and Search Console Data with Basic Code


Excel and Google Sheets have become the lifeblood of all of us. From the most basic to the most complicated tasks, they are somewhere in our daily routine. For SEO analysts in particular, learning advanced formulas and getting work done faster is a valuable extra skill. However, as the data grows, even the most sensible formula can lose its usefulness while you wait: Excel can crash, Sheets can stop responding, and even when they do respond, it is so slow that writing formulas and waiting can eat up hours. We have all sat in front of the screen with prayers, pleas, and lucky charms, hoping a calculation would finish. In short, working with large websites and lots of data can turn into chaos. This is where Python comes in, a language that has become quite popular in the SEO world in recent years. It is easy to learn and use at a basic level, and it processes even very large datasets quickly.

My aim in this article is to show how the work we normally do in Excel can be done quickly in Python with the Pandas library, using the most basic and simple code. I will also try to show how a few lines of code can produce analyses and insights that we would not even attempt in Excel, or that would take far too long there.

Don't be afraid: programming is a whole different world and a whole different endeavor, but knowing just enough code to speed up your daily routine (I will try to cover most of it here) will save you a lot of time. If you enjoy it, you can of course continue on to modeling and even Machine Learning and Deep Learning. And if you want to see what is possible in SEO once you know Python at an advanced level, the writings of Hamlet Batista or JR Oakes will broaden your horizons considerably.



Let's get started. There is plenty of information online about how to download and install Python on your computer, so let's get straight to the code.

Loading Libraries
In this article we will only use Pandas for dataframes and Matplotlib for charts, but Python has many more libraries. I list a few others below as examples; if you want to go further in data analysis, they will be very useful.
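A minimal import block for this article might look like the following; the commented-out lines are only examples of other popular libraries, not requirements:

import pandas as pd              # dataframes
import matplotlib.pyplot as plt  # charts

# Examples of other popular data libraries (not used in this article):
# import numpy as np
# import seaborn as sns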


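A sketch of the first step, assuming the Screaming Frog export is saved as 'internal_all.xlsx' (the filename here is only an example):

# Read the Screaming Frog export; skiprows=1 skips the report title row
df1 = pd.read_excel('internal_all.xlsx', skiprows=1)
df1.head()  # show the first five rows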
Here we read our crawl file with pd.read_excel and stored its contents in df1 (for a CSV export, pd.read_csv works the same way). We then display the first five rows with the head() command. The reason we use skiprows=1 is that in the Screaming Frog output the column names are in the second row, not the first. Now let's say you want to see the pages that return a 404 status code and save them to a separate file:

# Keep only the rows whose Status Code equals 404
df404 = df1[df1['Status Code'] == 404]
df404


In Python, we use the == sign to test equality and the != sign to test inequality. Now let's save the result to a separate file:
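One way to write that export is the to_csv method (index=False simply drops Pandas' row index column):

# Save the 404 pages to a separate CSV file
df404.to_csv('export404.csv', index=False)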

We have exported the pages that return a 404 status code to a file called export404.csv. With similar one-liners on separate lines, you can keep duplicate pages, redirected pages, pages without title tags, and many other crawl outputs in different places; by simply re-running those lines on the next crawl, you can have all your crawl outputs ready in seconds, before any manual review. For example, let's detect pages without title tags:
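A sketch of that check, assuming the column is named 'Title 1' as in the Screaming Frog output:

# Select the rows where the Title 1 column has no value (NaN)
no_title = df1.loc[df1['Title 1'].isnull()]
no_title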
Here, the isnull() command detects NaN values, i.e. missing values, in the Title 1 column, and .loc is what selects and prints those rows. If you write 'Meta Description 1' instead of 'Title 1', it will show you the pages without a meta description tag. As in the 404 example above, you can continue your analysis by saving this data to separate files.

Suppose you also want to know how dense these empty, NaN rows are in your data. Imagine doing that in Excel by filtering column by column. In Python, a single line gives you an overview of your data:
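For example, something like the following one-liner reports, for every column, what share of its values is missing; using .sum() instead of .mean() would give you the raw counts:

# Share of missing (NaN) values per column, from 0.0 to 1.0
df1.isnull().mean()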

Combining Crawl Analysis with Google Analytics Data
Now that our basic crawl analysis is complete, we can go a little deeper by matching this data with Analytics data. First, export the Landing Page report for the last month, or the last three months, depending on the period you want to examine. We will combine the two files based on URLs. But first, let's read this file just as we did with the crawl file:
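A sketch of that step; the filename 'analytics_landing_pages.xlsx' and the column names 'Address' (Screaming Frog) and 'Landing Page' (Analytics) are assumptions you should adapt to your own exports:

# Read the Analytics landing page report
df2 = pd.read_excel('analytics_landing_pages.xlsx')
df2.head()

# Combine crawl data and Analytics data on the URL columns
df_merged = pd.merge(df1, df2, left_on='Address', right_on='Landing Page', how='left')
df_merged.head()

Note that Analytics often exports landing pages as paths rather than full URLs, so you may need to normalize one of the columns before merging.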