Critical Vulnerability Strikes WordPress Ad Inserter
Ad Inserter, a popular ad management WordPress plugin, was discovered to contain a critical vulnerability. The vulnerability allows any authenticated user, with a role as low as subscriber, to execute code on the affected website. Users of the plugin are advised to update immediately.

Screenshot of the WordPress dashboard. In the top left-hand corner is a link that allows you to update your plugins.

Description of Ad Inserter Vulnerability

There are actually two vulnerabilities.

Authenticated Path Traversal Exploit

The first vulnerability is an authenticated path traversal exploit. It exists in Ad Inserter versions 2.4.19 and under.

This type of exploit allows an attacker to access areas of a site by adding sequences such as ../ to the URL, “traversing” to an area that may allow them to execute code or see private information.

According to the Common Weakness Enumeration (CWE) page about path traversal, hosted on a website maintained by the U.S. Department of Homeland Security, this is how a path traversal exploit works:

“The software uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the software does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.”
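Conceptually, the missing defense is exactly that neutralization step. The sketch below is an illustrative Python example of the containment check described in the CWE text, not code from Ad Inserter itself (which is written in PHP):

```python
import os

def safe_read(base_dir: str, user_path: str) -> bytes:
    """Resolve a user-supplied path and refuse anything that
    escapes the allowed base directory."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    # A traversal payload such as "../../wp-config.php" resolves
    # to a location outside base_dir and is rejected here.
    if os.path.commonpath([base, target]) != base:
        raise PermissionError("path traversal attempt blocked")
    with open(target, "rb") as f:
        return f.read()
```

Without the `commonpath` check, the `../` elements in the input would be honored when the path is resolved, which is precisely the weakness the CWE describes.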

The second vulnerability is rated critical. It was discovered on Friday, July 12th by the WordFence team and swiftly fixed by Ad Inserter the following day, Saturday, July 13, 2019.

Authenticated Remote Code Execution

The second vulnerability is an authenticated remote code execution (RCE). It allows any user registered with the site, with permissions as low as a subscriber, to execute arbitrary code on a WordPress installation.

The RCE exploit affects Ad Inserter versions 2.4.21 and under.

According to the WordFence website:

“On Friday, July 12th, our Threat Intelligence team discovered a vulnerability present in Ad Inserter, a WordPress plugin installed on over 200,000 websites. The weakness allowed authenticated users (Subscribers and above) to execute arbitrary PHP code on websites using the plugin.

We privately disclosed the issue to the plugin’s developer, who released a patch the very next day.

This is considered a critical security issue…”

Ad Inserter Plugin Reacted Swiftly and Ethically

Almost all plugins and software may contain a vulnerability. What’s important is how quickly a developer responds to issues and how transparent the developers are about it.

Screenshot of the Ad Inserter changelog, showing that the team responded ethically and transparently.

The Ad Inserter team deserves praise for how quickly they responded and for their transparency about the updates. Ad Inserter alerted users to the vulnerability through the changelog that is visible on every user’s update page, which is important because it conveys the urgency of the update.

The Ad Inserter team acted swiftly and ethically. That’s the best that can be expected from any WordPress developer.

Update Ad Inserter

All users of the Ad Inserter WordPress plugin are urged to log in to their WordPress installation and update their Ad Inserter plugin.

Read the WordFence announcement here.



Crawl data analysis of 2 billion links from 90 million domains offers glimpse into today’s web


The web is essential not only for people working in digital marketing, but for everyone. We professionals in this field need to understand the big picture of how the web functions for our daily work. We also know that optimizing our customers’ sites is not just about their sites, but also about improving their presence on the web, which is connected to other sites by links.

To get an overall view of the web we need data, lots of data, and we need it on a regular basis. Some organizations provide open data for this purpose, like HTTP Archive, which collects and permanently stores the web’s digitized content and offers it as a public dataset. A second example is Common Crawl, an organization that crawls the web every month. Its web archive has been collecting petabytes of data since 2011. In their own words, “Common Crawl is a 501(c)(3) non-profit organization dedicated to providing a copy of the internet to internet researchers, companies and individuals at no cost for the purpose of research and analysis.”

In this article, a quick data analysis of Common Crawl’s recent public data and metrics will be presented to offer a glimpse into what’s happening on the web today.

This data analysis was performed on almost two billion edges of nearly 90 million hosts. For the purposes of this article, the term “edge” refers to a link: an edge from one host (domain) to another is counted only once if there is at least one link from the first host to the second. Note also that the PageRank of a host depends on the number of links it receives from other hosts, not on the number it gives to others.
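That counting rule amounts to deduplication. As a sketch (a hypothetical helper, not Common Crawl’s actual pipeline), host-level edges can be derived from raw host-to-host links like this:

```python
def host_edges(links):
    """Collapse raw host-to-host links into graph edges: the pair
    (source, target) is kept once, no matter how many individual
    links exist from source to target."""
    return {(src, dst) for src, dst in links}
```

Because the result is a set of ordered pairs, repeated links between the same two hosts contribute a single edge, while links in the opposite direction remain a distinct edge.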

There is also a dependency between the number of links given to hosts and the number of subdomains of a host. This is not a great surprise given that of the nearly 90 million hosts, the one receiving links from the maximum number of hosts is “googleapis.com,” while the host sending links to the maximum number of hosts is “blogspot.com.” And the host having the maximum number of hosts (subdomains) is “wordpress.com.”

The public Common Crawl data include crawls from May, June and July 2019.

The main data analysis is performed on the three following compressed Common Crawl files.

These two datasets are used for the additional data analysis concerning the top 50 U.S. sites.

The Common Crawl data provided in the three compressed files belong to their recent domain-level graph. First, in the “domain vertices” file, there are 90 million nodes (naked domains). In the “domain edges” file, there are their two billion edges (links). Lastly, the “domain ranks” file contains the rankings of naked domains by their PageRank and harmonic centrality.

Harmonic centrality is a centrality measure, like PageRank, used to discover the importance of the nodes in a graph. Since 2017, Common Crawl has used harmonic centrality in its crawling strategy for prioritization by link analysis. Additionally, in the “domain ranks” dataset the domains are sorted according to their harmonic centrality values, not their PageRank values. Although harmonic centrality doesn’t correlate with PageRank on the final dataset, it does correlate with PageRank in the top 50 U.S. sites data analysis. There is a compelling video, “A Modern View of Centrality Measures,” in which Paolo Boldi presents a comparison of PageRank and harmonic centrality measurements on the Hollywood graph. He states that harmonic centrality selects top nodes better than PageRank.
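For intuition, the harmonic centrality of a node sums the reciprocals of shortest-path distances from every other node that can reach it (unreachable nodes contribute zero). A toy pure-Python sketch, not Common Crawl’s implementation:

```python
from collections import deque

def harmonic_centrality(nodes, edges):
    """Harmonic centrality of each node v: sum of 1/d(u, v) over
    all other nodes u, where d(u, v) is the shortest directed
    path length from u to v (unreachable u contributes 0)."""
    # Build reverse adjacency so a BFS from v walks edges
    # backwards, yielding the distances d(u, v) into v.
    rev = {n: [] for n in nodes}
    for u, v in edges:
        rev[v].append(u)
    scores = {}
    for v in nodes:
        dist = {v: 0}
        queue = deque([v])
        while queue:
            x = queue.popleft()
            for u in rev[x]:
                if u not in dist:
                    dist[u] = dist[x] + 1
                    queue.append(u)
        scores[v] = sum(1 / d for n, d in dist.items() if n != v)
    return scores
```

On the chain a → b → c, for example, c scores 1 + 1/2 = 1.5 while a, which nothing links to, scores 0; like PageRank, the measure rewards being reachable by many nodes, but it degrades gracefully on disconnected graphs.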


Preview of Common Crawl “domain vertices” dataset

Preview of common crawl “domain edges” dataset

Preview of Common Crawl “domain ranks” dataset sorted by harmonic centrality  

The preview of the final dataset obtained by three main Common Crawl datasets; “domain vertices,” “domain edges” and “domain ranks” sorted by PageRank 

Column names:

  • host_rev: Reversed host name, for example ‘google.com’ becomes ‘com.google’ 
  • n_in_hosts: Number of other hosts which the host receives at least one link from
  • n_out_hosts: Number of other hosts which the host sends at least one link to
  • harmonicc_pos: Harmonic centrality position of the host
  • harmonicc_val: Harmonic centrality value of the host
  • pr_pos: PageRank position of the host
  • pr_val: PageRank value of the host
  • n_hosts: Number of  hosts (subdomains) belonging to the host
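The host_rev encoding in the first column is a simple label reversal; a minimal sketch:

```python
def reverse_host(host: str) -> str:
    """Turn a host name into Common Crawl's reversed host_rev
    form, e.g. 'google.com' -> 'com.google'."""
    return ".".join(reversed(host.split(".")))
```

The transformation is its own inverse, which makes it easy to convert a list of sites to match the dataset and back again.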

Statistics of Common Crawl final dataset

*link: Counted as a link if there is at least one link from one host to another.

  • Number of incoming hosts: 
    • Mean, min, max of n_in_hosts  = 21.63548751, 0, 20081619
    • The reversed host receiving links from the maximum number of hosts is ‘com.googleapis’.
  • Number of outgoing hosts: 
    • Mean, min, max of n_out_hosts  = 21.63548751, 0, 7813499
    • The reversed host sending links to the maximum number of hosts is ‘com.blogspot’.
  • PageRank 
    • mean, min, max of pr_val  = 1.13303402e-08, 0., 0.02084144
  • Harmonic centrality
    • mean, min, max of harmonicc_val  = 10034682.46655859, 0., 29977668.
  • Number of hosts (subdomains)
    • mean, min, max of n_hosts  = 5.04617139, 1, 7034608
    • The reversed host having the maximum number of hosts (subdomains) is ‘com.wordpress’.
  • Correlations
    • correlation(n_in_hosts, n_out_hosts) = 0.11155189
    • correlation(n_in_hosts, n_hosts) = 0.07653162
    • correlation(n_out_hosts, n_hosts) = 0.60220516
    • correlation(n_in_hosts, pr_val) = 0.96545709
    • correlation(n_out_hosts, pr_val) = 0.08552065
    • correlation(n_in_hosts, harmonicc_val) = 0.00527706
    • correlation(n_out_hosts, harmonicc_val) = 0.00440205
    • correlation(pr_val, harmonicc_val) = 0.00400214
    • correlation(pr_val, n_hosts) = 0.05847027
    • correlation(harmonicc_val, n_hosts) = 0.00042441
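The pairwise figures above are plain Pearson correlation coefficients; any numerical library’s corr function computes the same thing, but a self-contained sketch makes the formula explicit:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length
    sequences, as used for the metric pairs above
    (e.g. n_in_hosts vs. pr_val)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Values near 1 (like n_in_hosts vs. pr_val at 0.965) indicate a strong linear dependency; values near 0 indicate essentially none.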

The correlation results show that the number of incoming hosts (n_in_hosts) is correlated with the PageRank value (pr_val) and with the number of outgoing hosts (n_out_hosts); the former correlation is very strong, the latter weak. There is also a dependency between the number of outgoing hosts and the number of hosts (n_hosts), i.e., subdomains of a host.

Data visualization: Distribution of PageRank

The graph below presents the plot of the count of pr_val values. It shows that the distribution of PageRank on almost 90 million hosts is highly right-skewed, meaning the majority of the hosts have very low PageRank.

Distribution of the number of hosts

The following graph presents the plot of the count of n_hosts (subdomains) values. It shows that the distribution of the number of hosts (subdomains) across almost 90 million hosts is highly right-skewed, meaning the majority of the hosts have a low number of subdomains.

Distribution of the number of incoming hosts

The graph below presents the plot of the count of n_in_hosts (number of incoming hosts) values. It shows us that this distribution is right-skewed, too.

Distribution of number of outgoing hosts

The following graph shows the plot of the count of n_out_hosts (number of outgoing hosts) values. Again, this distribution is also right-skewed.

Distribution of harmonic centrality 

The following graph presents the plot of the count of harmonicc_val values. It shows that the distribution of harmonicc_val on almost 90 million hosts is not highly right-skewed like the PageRank or number-of-hosts distributions. It is not a perfect Gaussian distribution, but it is more Gaussian than the distributions of PageRank and number of hosts. This distribution is multimodal.

Scatter plot of number of incoming hosts vs number of outgoing hosts

The graph below presents the scatter plot of n_in_hosts on the x-axis and n_out_hosts on the y-axis. It shows that the numbers of outgoing and incoming hosts are not directly dependent on each other overall. In other words, when the number of links a host receives from other hosts increases, its outgoing links to other hosts do not increase. Hosts without a significant number of incoming hosts readily give links to other hosts, but hosts with a large number of incoming hosts are not as generous.

Scatter plot of number of incoming hosts vs. PageRank 

The graph below presents the scatter plot of the n_in_hosts values on the x-axis and the pr_val values of hosts on the y-axis. It shows that there is a correlation between the number of incoming hosts to a host and its PageRank. In other words, the more hosts link to a host, the greater its PageRank value.
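That relationship is built into how PageRank is computed. Here is a toy power-iteration PageRank on a tiny host graph (an illustrative sketch of the measure, not Common Crawl’s implementation): the node gathering links from more hosts ends up with the higher score.

```python
def pagerank(nodes, edges, d=0.85, iters=50):
    """Toy power-iteration PageRank over a directed host graph.
    d is the damping factor; dangling nodes spread their rank
    uniformly across all nodes."""
    out = {n: [] for n in nodes}
    for u, v in edges:
        out[u].append(v)
    n = len(nodes)
    pr = {v: 1 / n for v in nodes}
    for _ in range(iters):
        nxt = {v: (1 - d) / n for v in nodes}
        for u in nodes:
            if out[u]:
                share = d * pr[u] / len(out[u])
                for v in out[u]:
                    nxt[v] += share
            else:
                # Dangling host: distribute its rank evenly.
                for v in nodes:
                    nxt[v] += d * pr[u] / n
        pr = nxt
    return pr
```

On the graph a → c, b → c, the host c, which receives links from two hosts, ranks above a and b, mirroring the scatter plot’s pattern at toy scale.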

Scatter plot of number of outgoing hosts vs. PageRank 

The graph below presents the scatter plot of n_out_hosts on the x-axis and the pr_val values of hosts on the y-axis. It shows that the correlation seen between the number of incoming hosts and PageRank does not exist between the number of outgoing hosts and PageRank.

Scatter plot of PageRank and harmonic centrality 

As the majority of hosts have low PageRank, we see a vertical line when we scatter plot the PageRank and harmonic centrality values of hosts. But we observe that hosts’ PageRank values begin to detach from the masses when their harmonic centrality value approaches 1.5e7, and the detachment accelerates beyond that value.

Top 50 US sites

The top 50 U.S. sites data are selected from the final Common Crawl dataset obtained at the beginning. Their hosts are reversed in order to match the “host_rev” column in the Common Crawl final dataset. For example, “youtube.com” becomes “com.youtube.” Below is a preview of this selection. There are 49 sites instead of 50 because “finance.yahoo.com” doesn’t exist in the Common Crawl dataset, but “com.yahoo” does.

The Majestic Million public dataset is also imported. The preview of this file is below.

These two datasets, the top 50 U.S. sites including Common Crawl data and metrics, and the Majestic Million dataset, are merged. The refips and refsubnets values are summed by reversed host.

The preview of this final dataset is below.

Statistics of top 50 US sites final dataset

  • Number of incoming hosts: 
    • mean, min, max of n_in_hosts  = 1565724.63265306, 1015, 16537551
  • Number of outgoing hosts:
    •  mean, min, max of n_out_hosts  = 80812.70833333, 28., 2529655
  • PageRank
    • mean, min, max of pr_val  = 0.00105891, 9.73490741e-07, 0.01285745 
  • Harmonic centrality
    • mean, min, max of harmonicc_val  = 18871331.16326531, 14605537., 27867704
  • Number of hosts (subdomains)
    • mean, min, max of n_hosts  = 36426.79591837, 22, 1555402

From this dataset, which contains the top 50 U.S. sites’ Common Crawl data and Majestic Million data, a pairwise scatter plot of the metrics pr_val, n_in_hosts, n_out_hosts, harmonicc_val, refips_sum and refsubnets_sum was created and can be seen below.

This pairwise scatter plot shows that the PageRank of the top 50 U.S. sites is somewhat correlated with all the metrics used in this graph except the number of outgoing hosts, represented by the legend n_out_hosts.

The correlation heatmap of these metrics is also available below.

Conclusion

The data analysis of the top 50 U.S. sites shows a dependency between the number of incoming hosts and the referring IP addresses (refips) and referring subnets (refsubnets, the subdivisions of an IP network that point to the target domain) metrics. Harmonic centrality correlates with PageRank, the number of incoming hosts, the refips and the refsubnets of the hosts.

Across the almost 90 million host ranks and their two billion edges (edges are links counted only once, even if there are many from a single host), there is a strong correlation between PageRank and the number of incoming edges to each host. However, we can’t say the same for the number of outgoing edges from hosts.

In this data analysis, we find a correlation between the number of subdomains and the number of outgoing edges from one host to other hosts. The distribution of PageRank on this web graph is highly right-skewed meaning the majority of the hosts have very low PageRank.

Ultimately, the main data analysis tells us that the majority of domains on the web have low PageRank, a low number of incoming and outgoing edges and a low number of host subdomains. We know this because all of these features have the same highly right-skewed type of data distribution.

PageRank is still a popular and well-known centrality measure. One of the reasons for its success is that it performs well on data with the kind of highly skewed distribution that the edges of domains follow.

Common Crawl is an invaluable and neglected public data source for SEO. The data are enormous and technically not easy to access, even though they are public. However, Common Crawl provides a “domain ranks” file once every three months that is relatively easy to analyze compared to the raw monthly crawl data. Due to a lack of resources, we cannot crawl the web and calculate the centrality measures ourselves, but we can take advantage of this extremely useful resource to analyze our customers’ websites and their competitors’ rankings, along with their connections on the web.


Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.


About The Author

Aysun Akarsu is a trilingual data scientist specialized in machine intelligence for digital marketing who wants to help companies make data-driven decisions to reach a broader, qualified audience. Aysun writes regularly about SEO data analysis on her blog, SearchDatalogy.

If Google says H1s don’t matter for rankings, why should you use them? Here’s why


On October 3, Webmaster Trends Analyst John Mueller delivered an edition of #AskGoogleWebmasters describing how Google approaches H1 headings with regard to ranking. The explanation caused a bit of a stir.

What Google said

“Our systems aren’t too picky and we’ll try to work with the HTML as we find it — be it one H1 heading, multiple H1 headings or just styled pieces of text without semantic HTML at all,” Mueller said.

In other words, Mueller is saying Google’s systems don’t have to rely on specific headings structure to indicate the main focus of content on the page.

What’s the fuss?

Mueller’s answer would appear to counter a longstanding “best practice” to use and optimize a single H1 and subsequent headings on a page. This is even reflected in the weighting of +2 that headings were given in our own most recent Periodic Table of SEO Factors.

“This seems to directly contradict years of SEO advice I’ve been given by all the SEO experts,” Dr. John Grohol, founder of PsychCentral.com, tweeted, expressing a reaction shared by many. Others cited their own experiences of seeing how H1 implementations can affect organic visibility.

How headings are designed to be used

The hierarchy of headings communicates what the content on a page is about as well as how ideas are grouped, making it easy for users to navigate the page. Applying multiple H1s or skipping headings altogether can create a muddled page structure and make a page harder to read.

Accessibility is also a significant reason to use headings, a point made even more salient now that courts have ruled that websites fall under the Americans with Disabilities Act.

“Heading markup will allow assistive technologies to present the heading status of text to a user,” the World Wide Web Consortium’s Web Content Accessibility Guidelines (WCAG) explains. “A screen reader can recognize the code and announce the text as a heading with its level, beep or provide some other auditory indicator. Screen readers are also able to navigate heading markup which can be an effective way for screen reader users to more quickly find the content of interest. Assistive technologies that alter the authored visual display will also be able to provide an appropriate alternate visual display for headings that can be identified by heading markup.”

Joost de Valk, founder of Yoast SEO WordPress plugin, noted that most WordPress themes are designed to have a single H1 heading just for post titles — “not for SEO (although that won’t hurt) but for decent accessibility.”

SEO consultant Alan Bleiweiss pointed to a WebAIM survey that found 69% of screen readers use headings to navigate through a page and 52% find heading levels very useful.

Many SEOs are concerned that Google’s lack of emphasis on accessibility standards, including rel=prev/next, may disincentivize site owners from implementing them, potentially making content harder to understand for users who depend on screen-reading technology, such as the visually impaired. Do that at your own risk.

H1s and SEO

“It is naive to think that Google completely ignores the H1 tag,” Hamlet Batista, CEO and founder of RankSense, told Search Engine Land.

“I’ve seen H1s used in place of title tags in the SERPs. So, it is a good idea to make the H1 the key topic of the page; in case this happens, you have a reasonably good headline,” Batista said, adding that having multiple H1s may provide less control of what text could appear in the search results if the H1 is used instead of the title.

Others said headings hiccups have hurt rankings.

In the comment above, which was left on Search Engine Roundtable’s coverage of the announcement, the commenter attributes the performance decline to an error that resulted in removal of H1s from his content.

You should still use proper headings

All John Mueller is saying is that Google can usually figure out what’s important on a page even when you’re not using headings or heading hierarchies. “It’s not a secret ranking push,” Mueller added in a follow-up. “A script sees the page, you’re highlighting some things as ‘important parts,’ so often we can use that a bit more relative to the rest. If you highlight nothing/everything, we’ll try to figure it out.”

As Mueller said at the end of the #AskGoogleWebmasters video, “When thinking about this topic, SEO shouldn’t be your primary objective. Instead, think about your users: if you have ways of making your content accessible to them, be it by using multiple H1 headings or other standard HTML constructs, then that’s not going to get in the way of your SEO efforts.”


About The Author

George Nguyen is an Associate Editor at Third Door Media. His background is in content marketing, journalism, and storytelling.

Revenge of the small business website


For several years, many SEOs have been proclaiming the end of small business (SMB) websites. The theory is that third-party destinations (GMB, Facebook, Yelp, etc.) have taken over and SMB sites will rarely, if ever, see consumer visits. GMB is now so widely used and so complete, the argument goes, that consumers never need to visit the underlying SMB site.

Recent investments and M&A. That description of consumer behavior is partly correct but not entirely. Websites continue to be a critical SMB asset and content anchor. That fact is underscored by WordPress parent Automattic’s most recent funding round of $300 million (at a $3+ billion valuation) and Square’s April 2018 roughly $365 million acquisition of site builder Weebly.

On a smaller scale, ten-year-old web design platform Duda recently raised $25 million (for just under $50 million in total funding). Duda has a network of more than 6,000 third-party resellers and agencies that work with SMBs. It will continue to focus on websites and presence management rather than expand horizontally into other marketing channels.

New Yahoo web design service. In addition, late last week Verizon-owned Yahoo launched a new web design product for SMBs. There are two service tiers ($99 and $299 per month). The offering includes design consultation, ongoing maintenance and content updates (it’s a SaaS product).

Yahoo Small Business was at one time the premier hosting company for SMBs. During a long period of somnolence, it was surpassed by GoDaddy and others. But following Verizon’s $4+ billion acquisition of Yahoo in 2016, the company has sought to invest in and develop new small business products and services and regain momentum. Its brand has remained relatively strong among SMBs across the U.S. despite the decline of Yahoo itself.

Now, Yahoo is developing a new generation of marketing products and services for SMBs. The web design service is just the first announcement.

SMB sites more trusted, still visited. A May 2019 consumer survey from BrightLocal found nearly twice as many respondents (56%) expected SMB websites to be accurate compared with Google My Business (32%). This was a surprise. However, a 2018 survey from the SEO firm found that the most common consumer action by a fairly significant margin, after reading a positive review, was to visit the SMB’s website.

Why we should care. The Small Business Administration says (.pdf) there are now roughly 30 million SMBs in the U.S. The SBA defines “small business” as having a headcount of up to 499 employees. There’s a massive difference between a firm with three or even 20 employees and one that has 300. Regardless, well over 90% of U.S. SMBs have fewer than 10 employees.

While a majority of SMBs now have websites (64% according to a 2018 Clutch survey), there’s still a significant opportunity for providers of websites. New businesses form and fail every quarter. And even with shrinking reach in organic search and social, websites are likely to remain the anchor of SMB digital marketing for the foreseeable future.


About The Author

Greg Sterling is a Contributing Editor at Search Engine Land. He writes about the connections between digital and offline commerce. He previously held leadership roles at LSA, The Kelsey Group and TechTV. Follow him on Twitter or find him on LinkedIn.


Copyright © 2019 Plolu.