The detection rates of anti-malware and antivirus scanners vary considerably. Knowing how to manually scan for and remove malware is a useful skill, whether you want to confirm a scanner's effectiveness or compensate for its failings. In this article, Andrey Kucherov, Malware Analyst at Imunify360, describes some essential manual website malware detection and cleanup techniques.
Introduction
The reality of modern security creates new challenges for web hosts every day. It is well known that there is no absolute protection that guarantees a 0% chance of your website being hacked. Even major players in online markets suffer from security breaches, and, from time to time, make users change their passwords "just in case". Usually, after getting into a victim's host, perhaps using some zero-day software vulnerability, brute-force attacks on weak passwords, or via infected "neighbor" sites, attackers will try to strengthen their positions by injecting additional malware into system folders. They may use malware such as web-shells or hidden uploaders, or they will attempt to insert 'backdoors' into a CMS's core files or into the website database.
Different malware scanners can help you identify uploaded web shells, backdoors, phishing content, spam mailers and other types of malicious software, i.e., everything already known and encountered before. Some scanners, like ImunifyAV, have heuristic rule-sets that allow them to identify files containing suspicious code, with signatures matching those used by malicious scripts. They can also identify files with suspicious attributes that may have been uploaded by hackers. Unfortunately, even if you run an antivirus scanner on your host after an infection, some malicious software may still go undetected. That means that intruders still have a backdoor into your system, and can get in again any time they want.
A modern hacker's scripts differ significantly from those that existed four or five years ago. Now, malware developers combine multiple techniques, like code obfuscation, encryption, decomposition, and external loading. They use many methods to bypass even the best antivirus software, so there is always a chance that something is left behind on a server once they have departed. It may seem paranoid, but you probably can't survive in the current security reality without some degree of healthy paranoia.
[Figure: Example of a file containing malicious code]
So, what can you do to effectively secure your website? A comprehensive approach must be used: initial automated malware scans must be followed by a manual check. In this article I will talk about how to identify malicious software without using malware scanners.
Types of malware
First, let's review what we are going to be looking for:
1. Hacker Scripts
Very often during an attack there are a number of files of a certain type uploaded onto a victim's system. These may be web shells (e.g. c99.php), backdoors, file uploaders, spam mailers, phishing pages, doorways (web pages that are created for the deliberate manipulation of search engine indexes), or defacement content (for example, the hacker's logo, obscene messages, links, etc.). In some cases you can simply search the name of the suspicious file on the web to find out what it does—script kiddies usually do not bother modifying files much so it will probably turn up in search results.
2. Code Injection
A popular method of deploying malware onto a target system is injection. Malicious code can be injected into the .htaccess file to create SEO and mobile redirects; PHP or Perl script injections can be used to create backdoors; and malvertising scripts can be injected into static .js (JavaScript) and .html files. Very often, an attack combines injection into an existing file with the upload of a command and control script. For example, malicious code can be injected into the EXIF header of a .jpg file, and then triggered and executed by some other benign-looking file uploaded in another part of the website.
3. SQL Injection
Database entries are a frequent target for hacker attacks. Static HTML content injections are possible using tags such as <script>, <iframe>, <embed>, or <object>. Such unauthorized code insertions can redirect visitors to related but unaffiliated sites, embed advertisements from which the site owner does not profit, embed mining trojans (e.g. CoinHive JavaScript miner), or spy on users and infect their computers using drive-by attacks. Besides this, many modern CMSs (e.g. IPB, vBulletin, modx and others) use template processors that allow the execution of PHP code, and the templates themselves are stored in the database. This gives attackers the opportunity to add backdoors and webshells to the website template directly in the database itself.
4. Cache Injections
If a caching server such as memcached is insecurely configured, attackers can inject content into cached data on the fly. In some cases, spam can be injected into website pages without actually hacking the core functionality of a website.
5. System components replacement
If hackers are able to get privileged (root) access to a server, they can replace some web server components or caching server components with infected versions. Such a web server can then be controlled via remote commands, and it can add dynamic redirects and malicious code to different website pages. As with cache injections, a webmaster is usually not able to spot the infection because all user files and databases appear unaffected. This is the most difficult case, and in some situations it is easier to rebuild the server and migrate user data rather than try to detect all the malware.
By now, I'll assume that you've already checked the files and database dump with AV scanners and that they have not identified anything malicious. If the malicious redirect or script (embedded in the <script> tag) is still somewhere on the pages of your website, redirects will continue sending users to malicious websites.
How should you proceed? Read on to find out.
Manual search cookbook
Make sure you have a valid backup of all your data before performing any manual cleanup steps on your production host.
In Linux and some Unix-like systems, it is hard to find more useful commands for searching files than find and grep.
This command will look for all files whose names match *.ph* (for example .php and .phtml) and that were modified in the past week.
find . -name '*.ph*' -mtime -7
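Sometimes, attackers change file modification dates to avoid detection. In this case, you can use the following command to look for .php and .phtml files whose status change time (ctime) falls within the past week. The ctime is updated whenever a file's content, ownership or permissions change, and it cannot be set back with touch.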
find . -name '*.ph*' -ctime -7
If you need to look for file changes in a certain time frame, you can also use this find command.
find . -name '*.ph*' -newermt 2015-01-25 ! -newermt 2015-01-30 -ls
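The two checks above can be combined to look specifically for backdated files. The following is a minimal sketch, assuming GNU find; the function name find_backdated_php is hypothetical and not part of any tool. It lists files whose status changed recently even though their modification date claims to be older, which is what you would expect after an attacker resets a timestamp.

```shell
# Hypothetical helper: list *.ph* files changed on disk within the window
# (ctime) whose modification date (mtime) falls outside that window.
find_backdated_php() {
    # $1: directory to scan (defaults to the current directory)
    # $2: window in days (defaults to 7)
    find "${1:-.}" -type f -name '*.ph*' \
        -ctime "-${2:-7}" ! -mtime "-${2:-7}"
}
```

For example, find_backdated_php /var/www/site 14 would use a two-week window (the path is only an illustration). Legitimate operations such as chmod, chown or restoring files from an archive also produce this pattern, so treat the output as a list to review rather than a verdict.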
And let's not underestimate the grep command either. It can recursively search for a pattern, drilling down through all folders and files. Here is an example; the -F option makes grep treat the pattern as a literal string, so the dots in the URL are not interpreted as regular expression wildcards.
grep -rilF 'example.com/google-analytics/jquery-1.6.5.min.js' .
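The same approach works for code patterns as well as known URLs. The sketch below is an illustration under stated assumptions: it relies on GNU grep, the function name grep_suspicious_php is hypothetical, and the pattern list (eval or assert wrapped around a decoding function) is a small illustrative sample rather than a complete signature set.

```shell
# Hypothetical helper: list PHP files that evaluate decoded or decompressed
# strings, a construct often seen in obfuscated code.
grep_suspicious_php() {
    # $1: directory to scan (defaults to the current directory)
    grep -rliE --include='*.ph*' \
        '(eval|assert)[[:space:]]*\([[:space:]]*(base64_decode|gzinflate|gzuncompress|str_rot13)[[:space:]]*\(' \
        "${1:-.}"
}
```

Some legitimate software uses these constructs too, and obfuscated malware can easily avoid them, so a match is a reason to open the file and read it, and an empty result proves nothing.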
When your web server is compromised, it is good practice to check for files with the SUID or SGID bit set, just to be safe.
sudo find / -type f \( -perm -4000 -o -perm -2000 \)
Finally, you can use a command like this to identify what PHP scripts are currently running in the background and possibly impacting website performance.
lsof +r 1 -p $(ps axww | grep httpd | grep -v grep | awk '{ if(!str) { str=$1 } else { str=str","$1 } } END { print str }') | grep vhosts | grep php
Malicious code analysis
Now that we know how to search for possibly malicious files, let's dive a bit deeper and list what exactly we are looking for and where.
1. Check the upload, cache, tmp, backup, log, and images directories.
You need to check all directories that are used for file uploading. For example, with Joomla you should look for .php files in the ./images folder. There is a high chance that if you find something, it will be malicious. For WordPress it is worth checking the wp-content/uploads, backup and theme cache directories.
find ./images -name '*.ph*'
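To check several upload-style directories in one pass, the command above can be wrapped in a small function. This is a minimal sketch: the function name find_php_in_uploads is hypothetical, and the default directory list (images, wp-content/uploads, tmp, cache) is an assumption based on the directories named in this section, so adjust it to your CMS layout.

```shell
# Hypothetical helper: look for PHP files in directories that should
# normally hold only uploaded media or temporary data.
find_php_in_uploads() {
    # $1: site root (defaults to the current directory)
    # remaining arguments: directories relative to the root (optional)
    root=${1:-.}
    if [ "$#" -gt 0 ]; then shift; fi
    if [ "$#" -eq 0 ]; then set -- images wp-content/uploads tmp cache; fi
    for dir in "$@"; do
        if [ -d "$root/$dir" ]; then
            find "$root/$dir" -type f -name '*.ph*'
        fi
    done
}
```

Calling find_php_in_uploads . media files would check ./media and ./files instead of the defaults.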
2. Looking for files with weird names
Here are examples of strange file names to look out for: php, fyi.php, n2fd2.php. You should also look for unusual patterns in file names. For example:
File names comprising an odd and unreadable mixture of letters, numbers and symbols, e.g. srrfwz.php, ath.php, kirill.php, b374k.php.php, tryag.php.
Because many users rename files by appending the digit '1', look out for normal-looking file names that append numbers other than 1 to filename parts, e.g. index9.php, wp3-login.php.
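As a rough first pass for the numbered-name pattern, you can list every .php file whose name contains a digit and review the result by eye. This is a sketch assuming GNU find (for -regextype); the function name find_numbered_php is hypothetical. It deliberately over-reports, since many legitimate files contain digits.

```shell
# Hypothetical helper: list .php files whose file name (not directory path)
# contains at least one digit, e.g. index9.php or wp3-login.php.
find_numbered_php() {
    # $1: directory to scan (defaults to the current directory)
    find "${1:-.}" -type f -regextype posix-extended \
        -regex '.*/[^/]*[0-9][^/]*\.php'
}
```

Comparing the output against a clean copy of the same CMS version is one way to separate files that belong to the distribution from files that do not.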
3. Looking for files with unusual extensions.
Let's suppose you have a website based on the WordPress CMS. Files with extensions such as .py, .rb, .pl, .cgi, .so, .c, .phtml, or .php3 would be unusual for such types of websites. If scripts or files with these extensions are found, most probably these are hacker tools. There is a chance of a false alarm, but it is low.
4. Looking for files with non-standard attributes or creation dates.
As mentioned above, files with modified attributes are suspicious. For example, if all the .php files that were uploaded to the server via FTP/SFTP have the owner file attribute set to 'user', and you also see a number of files having this attribute set to 'www-data', then it is worth checking the latter. Also, check if the script creation date is earlier than the website creation date. You can use the command templates from the cookbook section to bake up your own search queries, and speed up or optimize them.
5. Looking for doorways using a large number of .html or .php files.
If a directory contains a few thousand .php or .html files, there is a high chance that this is a doorway. You can use the following command to find the top 50 directories with the highest file counts (if you have many hosting accounts and files on the server, it is better to use this command in the specific folder/home directory you would like to check to save some execution time):
find ./ -xdev -type d -print0 | while IFS= read -d '' dir; do echo "$(find "$dir" -maxdepth 1 -print0 | grep -zc .) $dir"; done | sort -rn | head -50
Server logs can help
- Correlating the date and time when an email was sent (using details in the mail log and email message header) with access_log entries helps to determine how mail spam was sent out, and to find the mailer script on the server.
- FTP xferlog analysis helps to identify which files were uploaded or changed during the attack and by whom.
- If your mail server and PHP settings are correctly configured you can find the name of the sender PHP script and the full path to it in your mail log or in the full email message header. This helps to quickly find and eliminate the source of spammy deliveries.
- Some modern CMSs and plugins have more advanced defense techniques to proactively detect cyber attacks. Their logs might show if there was any attack and whether the CMS or plugin was able to protect itself or not.
- The access_log and error_log files also allow you to track a hacker's actions, if you were able to identify the script names that were used, the IP address or the HTTP user agent (User-Agent). You can also check the POST requests made on the day the attack happened. Often, such checks allow you to find which malicious files were uploaded and which were already present before the attack.
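The POST request check from the last point can be sketched as follows. This assumes the access log uses the common or combined log format, where the fourth whitespace-separated field holds the timestamp and the sixth and seventh hold the method and path; the function name post_requests_on is hypothetical.

```shell
# Hypothetical helper: count POST requests per client IP and path for one day.
post_requests_on() {
    # $1: path to the access log
    # $2: day exactly as written in the log, e.g. 25/Jan/2015
    awk -v day="$2" '
        index($4, "[" day ":") == 1 && $6 == "\"POST" { print $1, $7 }
    ' "$1" | sort | uniq -c | sort -rn
}
```

For example, post_requests_on access_log 25/Jan/2015 prints one line per IP and path pair with the most frequent first. POST requests to scripts inside upload or image directories are usually the ones worth examining first.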
Integrity checks
It is much easier to analyze attack vectors and look for malware scripts on websites if some security precautions were taken beforehand. Integrity checks help to identify changes on the file system in a timely manner, and to detect malicious activity quickly.
The easiest and most effective way to perform such checks is by using version control systems such as git, SVN, or CVS. For example, with git, if you correctly configure the .gitignore file, the process of integrity checking comes down to executing two commands:
git status # check all changed files
git diff # find malicious code
Keeping the site under version control also gives you a reference copy of your files, and allows you to quickly restore the website to a previous state. Experienced server administrators can also use inotify, tripwire, auditd and similar tools to track file and folder access and changes.
Unfortunately, it is not always possible to configure a version control system or site integrity check services on a server. On shared hosting, for example, you often cannot install a version control system or system services. To overcome this problem you can use CMS extensions, in the form of a plugin or a stand-alone script, to track file changes. Some CMSs (e.g. Bitrix or DLE) already have built-in integrity checks.
If the website is using custom scripts or is built with static HTML files, you can use the following shell command to make a snapshot of currently stored files:
ls -lahR > original_file.txt
If you later suspect an infection, you can create another snapshot and then compare the two using any comparison software you like, for example, WinDiff, Araxis Merge, Beyond Compare, the diff command (on Linux) or even an online comparison tool.
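A listing made with ls records sizes and dates, so it will not reveal a file whose content was changed while its size and timestamp were preserved. A checksum-based snapshot covers that case. The following is a minimal sketch, assuming the sha256sum utility from GNU coreutils is available and that file names contain no spaces; the function names snapshot_tree and compare_snapshots are hypothetical.

```shell
# Hypothetical helper: print "<sha256>  <path>" for every file, sorted by path.
snapshot_tree() {
    # $1: directory to snapshot
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# Hypothetical helper: print paths that were added, removed or changed
# between two snapshots produced by snapshot_tree.
compare_snapshots() {
    # $1: old snapshot file, $2: new snapshot file
    diff "$1" "$2" | awk '/^[<>]/ { print $3 }' | sort -u
}
```

A typical use would be snapshot_tree /var/www/site > before.txt while the site is known to be clean, then snapshot_tree /var/www/site > after.txt later, followed by compare_snapshots before.txt after.txt (the path is only an illustration). Store the snapshot files outside the web root so an attacker cannot alter them.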
Conclusion
In cases where antivirus solutions fail to clean a hacked website, it is useful to know how to do it yourself on the command line, with heuristics, and using built-in OS and CMS tools and features. Even with high-rate detection malware scanners such as ImunifyAV, being able to confirm the detection rates manually is an important, confidence-building skill.
ImunifyAV is a completely free antivirus and anti-malware scanner. Upgrade to ImunifyAV+ to access the built-in, one-click, fully automated cleanup feature, or get it as part of Imunify360's complete and comprehensive website security solution, which includes an intelligent WAF, IDS and IPS, Proactive Defense, automated kernel patch management and more.