Links service: three years of fighting spam

🕒 1 month(s) ago

For the fiftieth article on our site, here is a technical presentation of our Links service, used to shorten URLs.

Launched in May 2019, first using the Lstu software written by Luc Didry, this service now runs on a piece of in-house software named rs-short. It is one of the association's first services, along with our DoH service.

For three years, this service has required a lot of moderation work, much more than our other services: more than two hours of volunteer time per month were spent fighting illegal use of this service.

In this article, we offer you a retrospective analysis of this never-ending fight against phishing.

Usage statistics

Public statistics of our Links service

View complete usage statistics as of March 2022

We receive traffic from 650 different IPs every day, for just over 1 GB of traffic per month. So we still operate at a very small scale.

Our service is used to create 20 to 40 links per day on average.

Out of these 20 to 40 links created each day, we count between 0 and 3 links on average that distribute illegal content, and that we have to disable promptly. This of course only includes the fraudulent links that we have managed to detect.

During the week of March 28 to April 3, 2022, we counted 204 links created, including 3 fraudulent links. However, sometimes our service experiences waves of fraudulent traffic, and a malicious person may create about 20 fraudulent links in a single day.

In total, at the time of writing this article, 14,150 links have been created with our Links service, including 937 links that we have marked as fraudulent.
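To put these figures in proportion, here is a quick back-of-the-envelope calculation in Python. It only uses the numbers quoted above; the variable names are ours.

```python
# Figures quoted in this article (at the time of writing).
total_links = 14_150
fraudulent_links = 937

# Sample week: March 28 to April 3, 2022.
week_links = 204
week_fraudulent = 3

overall_share = fraudulent_links / total_links
week_share = week_fraudulent / week_links

print(f"Overall: {overall_share:.1%} of links marked as fraudulent")
print(f"Sample week: {week_share:.1%} of links marked as fraudulent")
```

The overall share (about 6.6%) is well above that of the sample week (about 1.5%), which is consistent with the waves of fraudulent traffic mentioned above.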

What's the point?

For a malicious person who wants to spread malicious content on the Internet, what is the point of using our little link shortener?

Filtering processes

Some spam filters, especially at email hosts, use a Bayesian filter to estimate the probability that an email received is illegitimate.
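To give an idea of the principle, here is a minimal sketch of a naive Bayes classifier in Python. It is a toy illustration, not the implementation of any particular email host: the training messages are made up, and real filters rely on far larger corpora and many more signals.

```python
import math
from collections import Counter

# Made-up training messages for illustration: (text, is_spam).
TRAINING = [
    ("verify your account now", True),
    ("your parcel is late pay customs fees", True),
    ("listen to your voicemail now", True),
    ("meeting notes for the association", False),
    ("your monthly newsletter is here", False),
    ("notes from the last meeting", False),
]

def train(samples):
    """Count word occurrences and messages for each class."""
    word_counts = {True: Counter(), False: Counter()}
    message_counts = {True: 0, False: 0}
    for text, is_spam in samples:
        word_counts[is_spam].update(text.lower().split())
        message_counts[is_spam] += 1
    return word_counts, message_counts

def spam_probability(text, word_counts, message_counts):
    """Estimate P(spam | text) with naive Bayes and Laplace smoothing."""
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    total_messages = sum(message_counts.values())
    log_scores = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        # Start from the log prior of the class.
        score = math.log(message_counts[label] / total_messages)
        for word in text.lower().split():
            # Add-one smoothing so unseen words never give a zero probability.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        log_scores[label] = score
    # Turn the two log scores back into a probability of spam.
    return 1 / (1 + math.exp(log_scores[False] - log_scores[True]))

word_counts, message_counts = train(TRAINING)
print(spam_probability("verify your voicemail now", word_counts, message_counts))
print(spam_probability("association meeting notes", word_counts, message_counts))
```

With this toy data, the first message scores well above 0.5 and the second well below it.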

Other methods can be used to fight spam. Social networks such as Facebook or Twitter perform other forms of automated censorship, but this moderation is generally quite opaque: we do not know by which criteria a message is considered spam.

We can assume, however, that most of these filters analyze the reputation of posted links, and in particular domain names, and may use various third-party services to accomplish this goal, starting with the Google Safe Browsing database (GSB) which is used by Firefox to block browsing to suspicious links.

Firefox phishing alert

Using the reputation of other sites

Here's how this impacts our nasty hooded hacker: when they share a link to a malicious page, their link could be detected and blocked quickly, especially if our hacker is using a newly acquired domain name that will have a low reputation to begin with.

Their solution would then be to use domain names that have been registered for a long time and have a decent reputation. This is precisely what a link shortener allows: the shortener's domain name is used as a gateway, and it is its reputation that will be at stake when the link is distributed and redirects to the malicious page.

What are the consequences?

When a phishing link is created, we need to act quickly: once the link is created, our hacker will spread the link within the hour, sometimes to tens of thousands of email addresses, sometimes through hacked email or social network accounts. This often results in a spike in traffic to our service.

Graph showing the fraudulent links created over the last two years

Domain name reputation

The first consequence that this can have is that the URL of the shortcut s.42l.fr/mylink is marked as fraudulent by services like Google Safe Browsing. This may lower the reputation of our domain name, but without any direct impact on other Internet users.

Our domain name dedicated to this service (s.42l.fr) has already been marked as fraudulent, especially by email filtering services. This situation is harder to recover from: it affects everyone who uses our link shortener. It could also affect the delivery rate of the emails sent from our mail server, as it shares the same IP.

What also worries us, and what motivates us to moderate this service very actively, is that the reputation of the parent domain name, 42l.fr, may be impacted over time. This has never happened so far, but some reputation analysis services might work in this way, and this would impact all our services; the same applies if our IP address is reported as suspicious.

Alerts

We have had to deal with Orange Cyberdefense and Netcraft on several occasions. These companies are mandated by big companies like Société Générale, Amazon or Google and send emails to the address abuse@yourdomainname.com (if you have a domain name, make sure you always monitor this address).

When a phishing page targeting one of their customers is used with our link shortener, we receive an email asking us to remove the fraudulent link as soon as possible.

Shutting down our servers

Shortly after these companies write to our abuse@ address, we may also receive an email from our host:

Hello,

We received a complaint today on one of your domains.

Please act within 24 hours to remove the content of your site or we will be obliged to suspend your service.

These companies have probably used the contact email provided for this purpose in the WHOIS registry. It is better to react very quickly in these cases.

What kind of illegal content?

Bar graph showing the different types of fraudulent content we encountered

We have sorted all the fraudulent pages that were submitted to our URL shortener into different categories:

  • Messaging scam: a page that asks the victim to log in to check their MMS, listen to their voicemail, join a chat group or check their emails. Mostly targets Orange, a French ISP.
  • Games scam: offers the victim a game crack or cheat tool to download. Often targets Roblox, sometimes Fortnite or Clash of Clans.
  • GAFAM scam: asks the victim to log into their online account at a large hosting company. Most often targets Microsoft, sometimes Google or Netflix.
  • Shipping scam: informs the victim that their parcel is late or that they have to pay customs fees, and asks for their bank details.
  • Bank scam: asks the victim to log into their bank account to activate a security mechanism such as "Certicode", or to urgently check their bank details. Usually targets Crédit Agricole, Banque Populaire, Banque Postale or PayPal.
  • Porn: invites the victim to chat with a local girl... no need to go into details. We have no rule against pornographic content in our terms of use, but the links that have been blocked seem to us to be used for spam purposes.
  • Terrorist or child sexual abuse content is of course systematically blocked as soon as we are informed of the existence of such content on our service. Fortunately for my sanity, it only represents 0.6% of the registered malicious links.

For a large portion of the links in our database, we were unable to guess their category. We based this categorization on the destination link (to take two fictitious examples, webmailorange.weebly.com or sites.google.com/postbank) or sometimes on the name given to the shortcut (consult-mms-sfr...). We did not keep any other traces allowing us to categorize these links afterwards.

Where are the fraudulent pages hosted?

Bar graph showing the different hosting types of the fraudulent content we encountered

We tried to guess the hosting used for each fraudulent page. In many cases, it is explicit (for example, if the URL starts with sites.google.com or contains wixsite.com or yolasite.com, we know that the site is hosted by Google Sites, Wix or Yola). In other cases, the fraudulent page uses its own domain name and we don't have more information about its hosting.

We tried to sort these hosts into the following categories:

  • IP address: the URL entered does not include a domain name, and therefore does not let us guess which host is used.
  • S3 bucket: the malicious person has hosted an HTML page or a PNG image on an object storage service.
  • CDN: this one is interesting: the malicious person has directly created a link to an official image of a site they want to imitate (e.g. PayPal). They then use the shortened link in their email, or on another HTML page in an <img> tag. The victim's browser will make the request to the shortened URL, thus engaging the reputation of the shortener's domain name. This also increments the link shortener's click counter, which lets the attacker know whether the victim has visited their page or not. However, the use of this method seems to be limited.
  • Image hoster: the hacker creates an image containing the text intended to trap their victim. This method is probably used to prevent the text from being analyzed by email servers.
  • Text host: the page is hosted on a platform such as Pastebin, or on other platforms that let users write content without giving them full control over the page layout and HTML/CSS (e.g. blog platforms...).
  • Web host: the page is hosted by a web host, usually free (Google Sites, Yola...) and uses the domain name of this host (sites.google.com, subdomain.yolasite.com). With some hosts like Yola or Wix, it is possible that the page is served under its own subdomain.
  • Own domain name: the malicious person seems to have bought their own domain name to host their fraudulent page. We do not know which host the fraudulent website was hosted with. Most of these domain names look like typosquatting.
  • URL shortener: The URL entered points to a URL shortener. It is common to encounter URL shortener chains for phishing links. We do not have information about the hosting of the page behind the shortener.
  • Hijacked site: the page appears to be hosted on a website that has been hacked.
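Part of this sorting can be guessed from the destination URL alone. The following Python sketch illustrates the idea and is not the exact procedure we followed: the suffix lists are short illustrative samples (the shortener entries in particular are assumed examples), and categories such as hijacked sites or CDN images cannot be recognized from the hostname and still require a human look.

```python
import ipaddress
from urllib.parse import urlsplit

# Illustrative samples only, not exhaustive lists.
WEB_HOST_SUFFIXES = ("sites.google.com", "wixsite.com", "yolasite.com")
TEXT_HOST_SUFFIXES = ("pastebin.com",)
SHORTENER_SUFFIXES = ("bit.ly", "tinyurl.com")  # assumed examples

def _matches(host, suffixes):
    """True if host equals one of the suffixes or is a subdomain of it."""
    return any(host == s or host.endswith("." + s) for s in suffixes)

def classify_hosting(url):
    """Guess a hosting category from the hostname of a destination URL."""
    if "://" not in url:
        url = "http://" + url  # urlsplit needs a scheme to find the hostname
    host = (urlsplit(url).hostname or "").lower()
    try:
        ipaddress.ip_address(host)
        return "IP address"
    except ValueError:
        pass  # not a bare IP address, keep going
    if _matches(host, SHORTENER_SUFFIXES):
        return "URL shortener"
    if _matches(host, TEXT_HOST_SUFFIXES):
        return "Text host"
    if _matches(host, WEB_HOST_SUFFIXES):
        return "Web host"
    return "Own domain name or unknown"

for example in ("sites.google.com/postbank", "http://192.0.2.10/login", "https://micrcscft.com/"):
    print(example, "->", classify_hosting(example))
```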

What conclusions can be drawn from this analysis?

In a large share of cases (at least 340 cases out of 702 identified, for a total of 937 malicious links), the malicious pages are located on servers belonging to web giants that offer free web hosting:

  • At Google (Sites, Cloud, Forms, Blogspot and even Firebase) in 23% of cases;
  • At Wix in 20% of cases;
  • At Yola in 18% of cases.

When we detect a malicious page on their platforms, these large companies take more than 24 hours to process our request, whereas phishing campaigns run very quickly and are often viewed by more than 1,000 people within a few hours. These pages remain online for weeks without these platforms noticing.

These web giants offer tools that facilitate the dissemination of malicious pages without applying adequate moderation measures to prevent the hijacking of their platforms. Through their laxity in the face of the dishonest, even criminal, activities served by their own platforms, they are the main culprits for the dissemination of malicious content on our link shortener. It is because of their negligence that we have to mobilize more than should be necessary.

In second place, a large number of deceptive domain names, which resemble those of official sites, are acquired by hackers (example: micrcscft.com). An Internet user who is not paying close attention can be fooled.

Finally, many sites administered by people who are not very scrupulous about the security of their infrastructure are hacked and used to host malicious pages.

URL shorteners are not to be outdone: we regularly see chains of URL shorteners used to spread malicious links. We have therefore chosen to block the creation of links pointing to a URL shortener, and we try to list as many of them as possible.

How did we detect these malicious links?

The software we use, rs-short, sends us an alert when a link is visited a certain number of times in a certain time frame (e.g. 30 times in 2 hours). This leads to many false positives, but it also allows us to spot deceptive links very quickly when they are widely distributed.
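The idea behind such an alert can be sketched with a sliding time window. The Python class below is only an illustration of the principle, not rs-short's actual code: the class name, method name and parameters are hypothetical, and timestamps are assumed to arrive in chronological order.

```python
from collections import defaultdict, deque

class ClickAlert:
    """Flag a link once it has been visited too often within a time window."""

    def __init__(self, max_clicks=30, window_seconds=2 * 60 * 60):
        self.max_clicks = max_clicks
        self.window_seconds = window_seconds
        self._clicks = defaultdict(deque)  # link name -> recent visit timestamps

    def record_click(self, link_name, timestamp):
        """Record one visit; return True if the alert threshold is reached."""
        clicks = self._clicks[link_name]
        clicks.append(timestamp)
        # Forget the visits that have fallen out of the time window.
        while clicks and clicks[0] <= timestamp - self.window_seconds:
            clicks.popleft()
        return len(clicks) >= self.max_clicks

# Small demonstration with a lower threshold: 3 visits within 60 seconds.
alert = ClickAlert(max_clicks=3, window_seconds=60)
results = [alert.record_click("mylink", t) for t in (0, 10, 20, 200)]
print(results)  # [False, False, True, False]
```

A legitimate link shared in a popular post triggers the same alert, which is where the false positives come from: the threshold only tells us which links deserve a human look.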

When we experience a phishing wave, which, fortunately, does not happen every day, we configure our software to list every link created in a log file. This tracking mode, which we activate temporarily, allows us to act before malicious links are distributed, at the significant cost of violating the privacy of our users.

Please note that we only consult links that we consider illegitimate, and that we carry out these operations in strict compliance with our terms of use: commitment 2 - "We do not consult your personal data for any other purpose than the technical maintenance of our services [...]". If you don't like this moderation policy, feel free to switch to another instance.

Finally, we also search the software's database by domain name (for example: sites.google.com). Out of 79 links pointing to Google Sites, only 20 are legitimate. This figure is even worse at Yola: out of 62 links created, not one is legitimate. At Wix, only 22 links are legitimate out of 90. Our hackers have their favorite hosts.
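Expressed as proportions, these counts look as follows. This short Python calculation uses only the figures quoted above; the variable names are ours.

```python
# (links created, links judged legitimate), as quoted above.
links_by_host = {
    "Google Sites": (79, 20),
    "Yola": (62, 0),
    "Wix": (90, 22),
}

# Share of links that were not legitimate, per host.
fraud_share = {
    host: (created - legitimate) / created
    for host, (created, legitimate) in links_by_host.items()
}

for host, share in fraud_share.items():
    print(f"{host}: {share:.0%} of links were not legitimate")
```

Roughly three out of four links pointing to Google Sites or Wix were malicious, and all of those pointing to Yola.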

These three methods allowed us to identify more than 99% of the malicious links we listed. The remaining 1% come from reports we received at our abuse@ address. We consider that if we receive an email in this mailbox, it is already too late: the link will have already been widely distributed and the reputation of our domain name will have suffered; our hosts may already know about it.

How to block hackers?

There is no magic method to block hackers. IP-based blocking would be ineffective: they use VPNs, hacked computers, or Tor relays, which we wouldn't want to block at all because we'd be preventing legitimate users from using our services.

Captchas have limited effectiveness. They do block automated attempts to create links, but all links created after the migration of our service to rs-short in April 2020 were likely entered by human beings. It is likely that our hackers are using click-work companies, or that they are entering the links themselves.

Reporting malicious sites to the major platforms (via the abuse@ email address or other appropriate channels) is not very effective in practice: the giants may be overwhelmed with reports or understaffed, but whatever the reason, they often take several days to respond. The malicious link will have had ample time to circulate in the meantime. In addition, this requires extra working time that we are not equipped to provide.

We ruled out the possibility of blocking links based on the name given to the shortcut: out of the 937 registered malicious links, exactly 50% did not specify a particular link name, in which case a random 8-character link name is generated. For the remaining links, the name is chosen by the malicious person. It may indeed be a name related to the malicious content, although sometimes it is very generic. But even if the link name were blocked, the hacker would only have to change their IP and try again with another link name.

At the moment we use block lists. We block many of the web hosts and domain names commonly used to run phishing campaigns and are considering adding large hosts like Google Sites to this list as well if moderation takes too much effort. We have also blocked the creation of links to many of the URL shorteners (impossible to list them all).
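As an illustration, a block list check at link creation time could look like the Python sketch below. The function name and the list entries are hypothetical and do not reflect our actual lists or rs-short's code; the point is that a destination is refused when its hostname, or any of its parent domains, appears on a list.

```python
from urllib.parse import urlsplit

# Made-up entries for illustration; not the association's actual lists.
BLOCKED_HOSTS = {"yolasite.com", "wixsite.com"}
BLOCKED_SHORTENERS = {"bit.ly", "tinyurl.com"}

def rejection_reason(url):
    """Return why a destination URL is refused, or None if it is accepted."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        return "invalid URL"
    # Build the hostname and all its parent domains:
    # a.b.example.com -> {a.b.example.com, b.example.com, example.com, com}
    labels = host.split(".")
    candidates = {".".join(labels[i:]) for i in range(len(labels))}
    if candidates & BLOCKED_SHORTENERS:
        return "destination is a URL shortener"
    if candidates & BLOCKED_HOSTS:
        return "destination host is on the block list"
    return None

for example in ("https://example.yolasite.com/login", "https://bit.ly/abc", "https://example.org/"):
    print(example, "->", rejection_reason(example))
```

Matching on parent domains matters because hosts like Yola or Wix serve each page under its own subdomain, so an exact match on the hostname would miss them.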

Find out more about developing defense mechanisms on the CHATONS forum.

A collective effort is needed

With the web giants

Even if we don't expect much from big web hosts like Yola, Dynadot and Google, some moderation on their platforms would be more than necessary. Another option, which they won't want to hear, is to downsize to allow smaller players to do a human-scale moderation job, which will be more effective because these small platforms are not overloaded with traffic.

We have never had an illegal link to Framasite, Ouvaton or BeeHome.

With URL shorteners

Since the launch of this service, we have been committed to a zero-tolerance policy against illegal content on our URL shortener. But if we act alone, this will only redirect hackers to other existing URL shorteners (and there is no shortage of them), especially among the CHATONS.

Fellow free software enthusiasts, web hosts and webmasters who offer a free URL shortener, don't let your service go unmoderated! Don't allow hackers to take advantage of your domain name's reputation for dishonest purposes. Don't wait to receive an email in your abuse@ mailbox to act.

If you think you won't have time to do such a moderation job, then maybe consider closing down your service or restricting it to trusted people; this could save you from serious trouble with your host or third-party companies. There are so many other services to host that will require less moderation.

~ Neil