Sunday, December 20, 2015

Spotting Malicious Node Relays

TOR is well-known software that protects communications by dispatching packets through different relays spread around the world and run by a network of volunteers. Because of the high degree of anonymity it provides, TOR has been used over the past years to cover malicious actions by physical and cyber attackers. TOR, especially through its browser implementation (the TOR Browser), is also known as one of the main (meaning most used) ways to access the Dark Web, where "malicious" people buy and sell illegal goods through dark markets. Each relay in the network can decide, through its own configuration, whether to act as an exit node (in the following picture, the last machine contacting "Bob") or just as a middle relay (in the following picture, a TOR node marked with a green cross). A relay that chooses to be an exit node exposes its own IP address to the public world; it is usually a good idea to alert the local police and the ISP in use about this in order to avoid penalties.


From TheTorProject.org

During the past year mass media such as television shows, radio stations, YouTube channels, Facebook groups, etc. disclosed many dark market addresses, swelling the flow of curious people to the DarkWeb and consequently exposing them to numerous new attack scenarios. Indeed, new attackers set up exit nodes or relay nodes in order to spy on and/or compromise the communication flows passing through them. The attack can happen in many ways, but the most used ones (as of today) are mainly three:

  1. DNS Poisoning: this technique consists in redirecting DNS queries for well-known web sites to crafted fake pages containing exploit kits able to eventually compromise the user's browser.
  2. File Patching: this technique consists in altering the requested file on its way back to the destination by adding malicious content to it; this happens directly on the exit node or relay before the file is delivered to the original requester.
  3. Certificate Substitution (SSL MITM): this technique consists in replacing the real web site certificate with a fake one in order to decrypt the communication flow and intercept credentials and parameters.
Working in CyberSecurity means being aware of such attacks and being able to decide whether or not to pass through TOR relays. Please be aware that TOR is not the only anonymous network in the DarkWeb!

My goal was to figure out when my TOR flow was passing through malicious relays. For this reason I decided to write a little Python script that performs some quick and dirty checks (DNS poisoning, file patching and SSL MITM), which are significant checks on the status of the TOR circuit in use. The script is 2 years old and was undisclosed until now. I decided to publish it since scientific researchers have implemented an advanced version of my FindMalExit.py. Please have a read here for the full paper on that topic.

The IDEA.
Well, actually it is a pretty simple idea: "let's grab certificates, IP addresses and files without passing through the TOR network (or passing through trusted circuits) and then repeat the process passing through all available relays. Compare the results and check whether somebody is changing the 'ground'".
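
The comparison step of this idea can be sketched in a few lines. This is only a minimal illustration, not the code of the script: it assumes the observations have already been collected into plain dictionaries, and the names `find_mismatches`, `baseline` and `observed`, as well as the example hosts and hashes, are hypothetical.

```python
def find_mismatches(baseline, observed):
    """Return the checks whose observed value is not among the trusted ones.

    baseline: dict mapping a check name to a set of acceptable values
    observed: dict mapping a check name to the value seen through a circuit
    """
    mismatches = {}
    for key, value in observed.items():
        trusted = baseline.get(key)
        if trusted is None:
            # Nothing to compare against: skip rather than raise a false alarm.
            continue
        if value not in trusted:
            mismatches[key] = value
    return mismatches


# Hypothetical values collected without TOR (or through a trusted circuit).
baseline = {
    "dns:www.example.org": {"192.0.2.10", "192.0.2.11"},
    "hash:http://files.example.org/tool.exe": {"ab12"},
}
# Hypothetical values collected through the circuit under test.
observed = {
    "dns:www.example.org": "192.0.2.10",
    "hash:http://files.example.org/tool.exe": "ff00",
}

suspicious = find_mismatches(baseline, observed)
print(suspicious)
```

A mismatch found this way is a hint rather than a proof: round-robin DNS and content delivery networks can legitimately return different addresses for the same name, so any hit deserves a manual check.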

The IMPLEMENTATION:
Below please find my poor code. Please remember it is not production code, so do not use it in production environments! (This code is just a first release of a bigger project now maintained by Yoroi.) I decided to publish the code in HTML format (so I can easily comment on it); if you need it in a more common form, check my GitHub repo (here).


#!/usr/bin/env python2

#========================================================================#
#               THIS IS NOT A PRODUCTION RELEASED SOFTWARE               #
#========================================================================#
# Purpose of findMaliciousRelayPoints is to prove that it is possible to #
# discover TOR malicious Relays Points. Please do not use it in          #
# any production  environment                                            #
# Author: Marco Ramilli                                                  #
# eMail: XXXXXXXX                                                        #
# WebSite: marcoramilli.blogspot.com                                     #
# Use it at your own risk                                                #
#========================================================================#

#==============================Disclaimer: ==============================#
#THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR      #
#IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED          #
#WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE  #
#DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,      #
#INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES      #
#(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR      #
#SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)      #
#HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,     #
#STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING   #
#IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE      #
#POSSIBILITY OF SUCH DAMAGE.                                             #
#========================================================================#




#-------------------------------------------------------------------------
#------------------- GENERAL SECTION -------------------------------------
#-------------------------------------------------------------------------
import StringIO
import tempfile
import time
import hashlib
import traceback
from   geoip         import  geolite2
import stem.control

TRUSTED_HOP_FINGERPRINT = '379FB450010D17078B3766C2273303C358C3A442' 
#trusted hop
SOCKS_PORT              = 9050
CONNECTION_TIMEOUT      = 30  # timeout before we give up on a circuit

#-------------------------------------------------------------------------
#---------------- File Patching Section ----------------------------------
#-------------------------------------------------------------------------
import pycurl

check_files               = {
                             "http://live.sysinternals.com/psexec.exe",
                             "http://live.sysinternals.com/psping.exe", }
check_files_patch_results = []

class File_Check_Results:
    """
    Analysis Results against File Patching
    """
    def __init__(self, url, filen, filepath, exitnode, found_hash):
        self.url           = url
        self.filename      = filen
        self.filepath      = filepath
        self.exitnode      = exitnode
        self.filehash      = found_hash


#------------------------------------------------------------------------
#------------------- DNS Poison Section ---------------------------------
#------------------------------------------------------------------------
import dns.resolver
import socks
import socket

check_domain_poison_results = []
domains                     = {
                                 "www.youporn.com",
                                 "youporn.com",
                                 "www.torproject.org",
                                 "www.wikileaks.org",
                                 "www.i2p2.de",
                                 "torrentfreak.com",
                                 "blockchain.info",
}

class Domain_Poison_Check:
    """
    Analysis Results against Domain Poison
    """
    def __init__(self, domain):
        self.domain  = domain
        self.address = []
        self.path    = []

    def pushAddr(self, add):
        self.address.append(add)

    def pushPath(self, path):
        self.path = path

#-----------------------------------------------------------------------
#------------------- SSL Strip Section ---------------------------------
#-----------------------------------------------------------------------
import OpenSSL
import ssl

check_ssl_strip_results   = []
ssl_strip_monitored_urls = {
                            "www.google.com",
                            "www.microsoft.com",
                            "www.apple.com",
                            "www.bbc.com",
}

class SSL_Strip_Check:
    """
    Analysis Result against SSL Strip
    """
    def __init__(self, domain, public_key, serial_number):
        self.domain        = domain
        self.public_key    = public_key
        self.serial_number = serial_number


#----------------------------------------------------------------------
#----------------     Starting Coding   -------------------------------
#----------------------------------------------------------------------


def sslCheckOriginal():
    """
    Download the original Certificate without TOR connection
    """
    print('[+] Populating SSL for later check')
    for url in ssl_strip_monitored_urls:
        try:
            cert = ssl.get_server_certificate((str(url), 443))
            x509 = OpenSSL.crypto.load_certificate(OpenSSL.crypto.FILETYPE_PEM, cert)
            p_k  = x509.get_pubkey()
            s_n  = x509.get_serial_number()

            print('[+] Acquired Certificate: %s' % url)
            print('    |_________> serial_number %s' % s_n)
            print('    |_________> public_key %s' % p_k)

            check_ssl_strip_results.append(SSL_Strip_Check(url, p_k, s_n))

        except Exception as err:
            print('[-] Error while acquiring certificates during setup phase!')
            traceback.print_exc()
    return time.time()


def fileCheckOriginal():
    """
    Download the reference copy of each file and store its hash.
    NOTE: query() always uses the local SOCKS proxy, so this baseline
    travels through TOR's default circuit rather than a direct connection.
    """

    print('[+] Populating File Hashing for later check')
    for url in check_files:
        try:
            data = query(url)
            file_name = url.split("/")[-1]

            # NamedTemporaryFile manages the descriptor that mkstemp left open
            with tempfile.NamedTemporaryFile(prefix="exitmap_%s_" % file_name, delete=False) as fd:
                fd.write(data)
                tmp_file = fd.name

            # Hash only after the file is closed, so buffered data is on disk
            file_hash = sha512_file(tmp_file)
            print('[+] Saving File  \"%s\".' % tmp_file)
            check_files_patch_results.append( File_Check_Results(url, file_name, tmp_file, "NO", file_hash) )
            print('[+] First Time we see the file..')
            print('    |_________> exitnode : None'       )
            print('    |_________> :url:  %s' % str(url)     )
            print('    |_________> :filePath:  %s' % str(tmp_file))
            print('    |_________> :file Hash: %s' % str(file_hash))
        except Exception as err:
            print('[-] Error ! %s' % err)
            traceback.print_exc()
    return time.time()


def resolveOriginalDomains():
    """
        Resolving DNS For original purposes
    """
    print('[+] Populating Domain Name Resolution for later check ')

    try:
        for domain in domains:
            response = dns.resolver.query(domain)
            d = Domain_Poison_Check(domain)
            print('[+] Domain: %s' % domain)
            for record in response:
                print(' |____> maps to %s.' % (record.address))
                d.pushAddr(record)
            check_domain_poison_results.append(d)
        return time.time()
    except Exception as err:
        print('[+] Exception: %s' % err)
        traceback.print_exc()
        return time.time()


def query(url):
  """
  Uses pycurl to fetch a site using the proxy on the SOCKS_PORT.
  """
  output = StringIO.StringIO()
  query = pycurl.Curl()
  query.setopt(pycurl.URL, url)
  query.setopt(pycurl.PROXY, 'localhost')
  query.setopt(pycurl.PROXYPORT, SOCKS_PORT)
  query.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
  query.setopt(pycurl.CONNECTTIMEOUT, CONNECTION_TIMEOUT)
  query.setopt(pycurl.WRITEFUNCTION, output.write)

  try:
    query.perform()
    return output.getvalue()
  except pycurl.error as exc:
    raise ValueError("Unable to reach %s (%s)" % (url, exc))



def scan(controller, path):
  """
  Scan Tor Relays Point to find File Patching
  """

  def attach_stream(stream):
    if stream.status == 'NEW':
      try:
        controller.attach_stream(stream.id, circuit_id)
        #print('[+] New Circuit id (%s) attached and ready to be used!' % circuit_id)
      except Exception as err:
        controller.remove_event_listener(attach_stream)
        controller.reset_conf('__LeaveStreamsUnattached')

  start_time = time.time()  # defined up front so the final return is always valid
  try:

    print('[+] Creating a New TOR circuit based on path: %s' % path)
    circuit_id = controller.new_circuit(path, await_build = True)
    controller.add_event_listener(attach_stream, stem.control.EventType.STREAM)
    controller.set_conf('__LeaveStreamsUnattached', '1')  # leave stream management to us
    start_time = time.time()

    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", SOCKS_PORT)
    socket.socket = socks.socket

    ip = query('http://ip.42.pl/raw').strip()
    match = geolite2.lookup(ip)  # returns None for unknown addresses
    country = match.country if match is not None else 'unknown'
    print('\n \n')
    print('[+] Performing FilePatch, DNS Spoofing and Certificate Checking '
          'passing through --> %s (%s) \n \n' % (ip, country))

    time_FileCheck = fileCheck(path)
    print('[+] FileCheck took: %0.2f seconds'  % ( time_FileCheck - start_time))

    #time_CertsCheck  = certsCheck(path)
    #print('[+] CertsCheck took: %0.2f seconds' % ( time_CertsCheck - start_time))

    time_DNSCheck  = dnsCheck(path)
    print('[+] DNSCheck took: %0.2f seconds'   % ( time_DNSCheck - start_time))

  except Exception as  err:
    print('[-] Circuit creation error: %s (%s)' % (path, err))

  finally:
    # always hand stream management back to TOR, even if a check failed
    controller.remove_event_listener(attach_stream)
    controller.reset_conf('__LeaveStreamsUnattached')

  return time.time() - start_time

def certsCheck(path):
    """
    SSL Strip detection
    TODO: It's still a weak control. Need to collect and to compare public_key()
    """
    print('[+] Checking Certificates')
    try:
        socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
        socket.socket = socks.socket

        for url in ssl_strip_monitored_urls:
            cert = ssl.get_server_certificate((str(url), 443))
            x509 = OpenSSL.crypto.load_certificate(OpenSSL.crypto.FILETYPE_PEM, cert)
            p_k  = x509.get_pubkey()
            s_n  = x509.get_serial_number()
            for stored_cert in check_ssl_strip_results:
                if str(url) == str(stored_cert.domain):
                    if str(stored_cert.serial_number) != str(s_n):
                        print('[+] ALERT Found SSL Strip on uri (%s) through path %s ' % (url, path))
                        break
                    else:
                        print('[+] Certificate Check seems to be OK for %s' % url)

    except Exception as err:
        print('[-] Error: %s' % err)
        traceback.print_exc()

    return time.time()

def dnsCheck(path):
    """
    DNS Poisoning Check
    """
    try:
        socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
        socket.socket = socks.socket

        print('[+] Checking DNS ')
        for domain in domains:
            ipv4 = socket.gethostbyname(domain)
            for p_d in check_domain_poison_results:
                if str(p_d.domain) == str(domain):
                    found = False
                    for d_ip in p_d.address:
                        if str(ipv4) == str(d_ip):
                            found = True
                            break
                    if found == False:
                        print('[+] ALERT:DNS SPOOFING FOUND: name: %s ip: %s  (path: %s )' % (domain, ipv4, path) )
                    else:
                        print('[+] Check DNS (%s) seems to be OK' % domain)
    except Exception as err:
        print('[-] Error: %s' % err)
        traceback.print_exc()

    return time.time()


def fileCheck(path):
    """
    Downloading file through TOR circuits doing the hashing
    """
    print('[+] Checking For File patching ')
    for url in check_files:
        try:
            # File retrieve
            data = query(url)
            file_name = url.split("/")[-1]

            # NamedTemporaryFile manages the descriptor that mkstemp left open
            with tempfile.NamedTemporaryFile(prefix="exitmap_%s_" % file_name, delete=False) as fd:
                fd.write(data)
                tmp_file = fd.name

            # Hash only after the file is closed, so buffered data is on disk
            found_hash = sha512_file(tmp_file)
            for i in check_files_patch_results:
                if str(i.url) == str(url):
                    if str(i.filehash) != str(found_hash):
                        print('[+] ALERT File Patch FOUND !')
                        print('    | path : %s' % str(path)                          )
                        print('    |_________> url: %s' % str(i.url)                 )
                        print('    |_________> original filePath: %s' % str(i.filepath))
                        print('    |_________> original fileHash: %s' % str(i.filehash))
                        print('    |_________> received filePath: %s' % str(tmp_file))
                        print('    |_________> received fileHash: %s' % str(found_hash))
                    else :
                        print('[+] File (%s) seems to be ok' % i.url)
                    break

        except Exception as err:
            print('[-] Error ! %s' % err)
            traceback.print_exc()
    return time.time()


def sha512_file(file_name):
    """
    Calculate SHA512 over the given file.
    """

    hash_func = hashlib.sha512()

    with open(file_name, "rb") as fd:
        hash_func.update(fd.read())

    return hash_func.hexdigest()


if __name__ == '__main__':

    start_analysis = time.time()
    print("""

  |=====================================================================|
  | Find Malicious Relay Nodes is a python script made for checking 3   |
  | unique kind of frauds such as:                                      |
  | (1) File Patching                                                   |
  | (2) DNS Poison                                                      |
  | (3) SSL Stripping (MITM SSL)                                        |
  |=====================================================================|
         """)

    print("""
  |=====================================================================|
  |                 Initialization Phase                                |
  |=====================================================================|
       """)
    dns_setup_time             = resolveOriginalDomains()
    print('[+] DNS Setup Finished: %0.2f' % (dns_setup_time - start_analysis))
    file_check_original_time   = fileCheckOriginal()
    print('[+] File Setup Finished: %0.2f' % (file_check_original_time - start_analysis))
    ssl_checking_original_time = sslCheckOriginal()
    print('[+] Acquiring Certificates  Setup Finished: %0.2f' % (ssl_checking_original_time - start_analysis))

    print("""
  |=====================================================================|
  |                 Analysis  Phase                                     |
  |=====================================================================|
          """)

    print('[+] Connecting and Fetching possible Relays ...')
    with stem.control.Controller.from_port() as controller:
      controller.authenticate()

      net_status = controller.get_network_statuses()


      for descriptor in net_status:
        try:
          fingerprint = descriptor.fingerprint

          print('[+] Selecting a New Exit Point:')
          print('[+] |_________> FingerPrint: %s ' % fingerprint)
          print('[+] |_________> Flags: %s ' % descriptor.flags)
          print('[+] |_________> Exit_Policies: %s ' % descriptor.exit_policy)

          if 'EXIT' in (flag.upper() for flag in descriptor.flags):
              print('[+] Found Exit Point. Performing Scan through EXIT: %s' % fingerprint)
              if None == descriptor.exit_policy:
                  print('[+] No Exit Policies found ... no certs checking')
              # scan every exit, not only those lacking an exit policy
              time_taken = scan(controller, [TRUSTED_HOP_FINGERPRINT, fingerprint])
          else:
              #print('[+] Not Exit Point found. Using it as Relay passing to TRUST Exit Point')
              pass
              #time_taken = scan(controller, [fingerprint, TRUSTED_HOP_FINGERPRINT])
          #print('[+] Finished Analysis for %s finished  => %0.2f seconds' % (fingerprint, time_taken))

        except Exception as exc:
            print('[-] Exception on  FingerPrint: %s => %s' % (fingerprint, exc))
            traceback.print_exc()
            pass
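
The TODO in certsCheck notes that comparing serial numbers alone is a weak control. One possible way to strengthen it, sketched here under my own assumptions rather than taken from the script, is to hash the whole certificate and compare fingerprints. The helper names `cert_fingerprint` and `same_certificate` are hypothetical; `ssl.PEM_cert_to_DER_cert` and `hashlib.sha256` are standard library calls.

```python
import hashlib
import ssl


def cert_fingerprint(pem_cert):
    """Return the SHA-256 hex digest of a PEM certificate's DER encoding."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha256(der).hexdigest()


def same_certificate(trusted_pem, observed_pem):
    """True when both PEM strings encode byte-identical certificates."""
    return cert_fingerprint(trusted_pem) == cert_fingerprint(observed_pem)
```

The PEM string returned by ssl.get_server_certificate, which the script already calls, could be passed to these helpers directly. Keep in mind that large sites may serve different valid certificates from different front ends, so a fingerprint mismatch still needs a manual look before blaming a relay.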



The RESULTS
I am not going to publish my results since Tor relays change over time and what I found using this script might be inaccurate and imprecise: more checks must be done. Moreover, it could be unpleasant to accuse specific relays (ergo IPs, ergo owners) of being "malicious". But I am going to endorse part of the results described by Philipp Winter and Stefan Lindskog in their paper (here).

From Spoiled Onions: Exposing Malicious Tor Exit Relays (Philipp Winter and Stefan Lindskog)
Many of the malicious relays were found in Russia, Turkey and Hong Kong. Not every malicious relay used all of the techniques to compromise the flow, but at least one was found. By far the most used technique is the SSL-Strip MITM, mainly used to spy on channels. Few file patching attacks were identified. This kind of attack is useful to spread Malware over networks and, together with DNS poisoning, is used more to "attack" than to "spy".

I hope you enjoy the script, which is quite old and needs a refactoring session but is still interesting (at least from my personal point of view).

Monday, October 26, 2015

RAI UNO: TG1 Cyber Security Speciale

Giving my contribution to the Italian public mainstream channel RAI UNO: talking about Cyber Security, Malware and Targeted Attacks with Barbara Carfagna.

http://www.rai.tv/dl/RaiTV/programmi/media/ContentItem-c0623097-0669-4734-8dae-69772ccdf87a.html#p=

Click on the Preview (above) to watch the video from the official website.

Monday, October 19, 2015

SandBoxes personal evaluations

Understanding "sandbox" technology is a fundamental step in Malware prevention. While it is obvious that evasion techniques such as (but not limited to) Malware encryption, Malware packing, metamorphism and polymorphism are able to evade classic defensive technologies such as (but not limited to) AntiVirus, Intrusion Detection and Prevention Systems, URL filtering and proxies, it is not obvious enough that Malware can evade sandboxes too.

This post is not about evasion techniques (I have talked and written a lot about those); it is about getting to know sandbox technologies better.

Each SandBox implements one or more specific detection strategies, which makes each SandBox a unique environment. It is hard to find two SandBoxes implementing the same strategies. My point here is to understand the technologies first and to figure out the different strategies behind them later.

If you are wondering why I decided to start from "implementation" (describing how specific sandboxes work) rather than from "foundations" (describing the most common ways to detect Malware), the answer is pretty simple: "I believe a bottom-up approach is much more interesting: starting from real technologies to reach more general strategies". I know it's questionable.

Let me start with Anubis. ANalysis of Unknown BInarieS is one of the "oldest" SandBoxes and has become one of the best-known online analysis systems. Anubis runs samples in its own emulated environment, consisting of a Windows XP operating system running as the guest in Qemu. The analysis is performed by monitoring the invocation of Windows API functions, as well as system service calls to the Windows Native API. Additionally, the parameters passed to these functions are examined and tracked.

CWSandbox. Another quite famous SandBox system. It executes the sample in a "patched" way in order to discover what it does: it loads the sample into memory and adds function calls that monitor API and system calls before and after their execution. It uses an "instrumentation" procedure to patch the binaries.

Norman SandBox. The Norman SandBox is a dynamic Malware analysis solution which executes the sample in a small controlled virtual environment that simulates the Windows operating system. The simulation involves multiple OS levels (such as API, system calls, libraries, etc.) as well as the local area network and external Internet connectivity. Norman supports memory protection emulation and multi-threading in order to better emulate the Windows OS. It has mostly been used to detect network worms since it makes heavy use of DNS, connection resolution, etc. It monitors the auto-start extensibility points (ASEPs) often used by Malware to achieve persistence.

Joebox. One of the first SandBox systems built to run on real hardware rather than in a virtual/emulated environment. It is based on a client/server architecture in which the client runs the suspicious sample while hooking (in user mode) API and syscall invocations, the Export Address Table and the System Service Descriptor Table. It uses a kernel driver to cloak the binary patching done in order to "inject calls-debugging code". It uses AutoIT to emulate user interaction with the machine during the analysis phase.

LastLine. Developed by Anubis's creators, Lastline implements multiple techniques including an Intrusion Detection System, DNS analysis and a SandBox. It analyzes suspicious files by performing a multi-path technique in a fully emulated environment. While Norman implements an OS emulation, Lastline performs CPU and memory emulation, staying hidden from the entire OS (kernel mode or user mode does not matter at this stage, since it sits beneath the whole OS).

ReVirt. Mainly based on a virtualization environment (VBox) where a local engine logs all the grabbed data. A centralized collector processes and analyzes the produced logs. It includes a system called BackTracker that helps system administrators understand (and thereby recover from) an intrusion, by automatically identifying potential sequences of steps that occurred in an intrusion.

FireEye. It is one of the best-known players in Cyber Security solutions; it provides many products in addition to a sandbox, but here we focus on the FireEye SandBox. It is based on proprietary virtualization: they do not run samples in known VM managers such as ESX, Hypervisor and VM-Ware, but built their own VM engine in order to "avoid" "VM-Aware" Malware.

Buster. Buster is a toolkit applied to Sandboxie, running on a virtual machine as well as on real hardware. Buster Sandbox Analyzer is a tool designed to analyze the behaviour of processes and the changes made to the system, and then evaluate whether they are suspicious. The changes made to the system can be of several types: file system changes, registry changes and port changes. It does not include any anti-evasion technique to mitigate "VM-Aware" Malware.

WildFire. Powered by Palo Alto, WildFire claims to be a full OS-level emulation system able to detect and report Malware within 15 minutes of discovery. It basically emulates the operating system, deceiving the Malware while observing syscalls, API calls, files, mutexes, registry keys, and so on.

HookAnalyzer. Well, it is actually hard to call this tool a SandBox, but let me include it here. It basically works by hooking the executable, getting dynamic as well as static information on it. It runs on physical hardware and does not involve any level of virtualization/emulation.

Cuckoo SandBox. A simple sandbox based on virtualization with many embedded heuristics to detect "VM-aware" Malware. It is implemented according to the client-server paradigm. On the client (a virtual machine running potentially almost any OS) a daemon named cuckoomon runs at system level. It hooks APIs and syscalls and sends the entire result set back to the VM manager (the cuckoo result-server). A post-processing engine elaborates the cuckoomon results and populates a DB. Finally, a web interface shows the results and the heuristics that were hit.
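
Several of the SandBoxes above (CWSandbox, Joebox, Cuckoo) rely on hooking: an API entry point is replaced with a wrapper that records the call before and after the original code runs. The following toy sketch shows the principle only; it is plain Python rather than real Windows API hooking, and `FakeOS`, `hook` and `call_log` are hypothetical names invented for the illustration.

```python
import functools

call_log = []  # hypothetical in-memory trace; a real sandbox ships this elsewhere


def hook(func):
    """Wrap func so every call is recorded before and after it runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append(("enter", func.__name__, args, kwargs))
        result = func(*args, **kwargs)
        call_log.append(("leave", func.__name__, result))
        return result
    return wrapper


class FakeOS(object):
    """Hypothetical stand-in for an operating system API surface."""
    def create_file(self, path):
        return "handle:%s" % path


# Install the hook by replacing the original entry point.
FakeOS.create_file = hook(FakeOS.create_file)

# The "sample" calls the API as usual and is unaware of the monitoring.
FakeOS().create_file("C:\\temp\\dropper.exe")
for event in call_log:
    print(event[0], event[1])
```

The weakness discussed in this post follows from the same mechanism: a sample that inspects the entry point can notice it has been replaced, which is one reason why hook-based monitoring is easier to detect than emulation beneath the OS.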

Many other SandBox solutions are available to the security community; for example, SandBlast (by Check Point), Jotti, PayloadSecurity and ThreatGrid (Cisco) are some of the best known. Each SandBox has its own particularities in both technical implementation and detection techniques. Even though in my heart I definitely know which is the best solution (for the time being) considering the Malware zoo I am observing, I do not want to give a judgement since I know that technologies follow markets, so there is no absolute "best in class". For this reason, take as my contribution the following considerations, based on logic and real experience, expressed by the following chart.


SandBox Strategies VS Comparison Properties (click to enlarge)

On the timeline, Virtualization, Virtualization + Anti-detection Heuristics, OS Emulation, Hardware Emulation, Hardware Agent-Based and Hardware-Based Agentless are the most used strategies in today's SandBoxes. On the left-hand side are the evaluation parameters:
  • Evasion Grade: how easy it is for Malware to detect the SandBox system. The higher this value, the easier the evasion.
  • Analysis Depth: how deep the SandBox analysis can go given the data it owns. The more "raw" data is involved, the deeper the analysis can be and the higher the complexity.
  • Solution Complexity: the complexity of the SandBox solution. The higher the complexity, the higher the bug rate (no feedback chain with the evasion technique has been considered so far).
  • Realization: the state of the art I see in actual implementations, grouped by detection strategy, based on my current experience with Malware detection and evasion technologies.
As you might observe, there is really no best solution right now. The SandBox strategies that are hardest for Malware to detect are the ones with the highest complexity and the lowest realization rate. The SandBox strategies that are easiest to detect can adopt heuristics to try to reduce that (once the sample has been positively analyzed, ergo time consuming), but they have the lowest complexity and the highest realization rate.

Depending on the "business" involved, you might need to choose the right sandbox within the right time frame. This activity includes (but is not limited to) organization analysis, threat detection, asset risks, past attacks and knowledge of the underground economy. I hope this post helps you choose the right sandbox for your business.



Monday, September 21, 2015

MalwareStats.org: New "Speed" and New Samples Available now.

Hello everybody, today's post is about speed improvements and new malware samples on MalwareStats.org. If you followed the genesis of MalwareStats.org you might remember the early development stage, when it took between 8 and 10 minutes to visualize statistics over 43k Malware analyses. Today it runs much better: almost 15 seconds to visualize 76.2K Malware analyses (ok, I know, it really depends on network speed and computational power, but tested on the same machine you will experience a huge performance gap).

Let me just remind you what MalwareStats.org is about:
"The continued growth in number and in complexity of malware is a well established fact. Malwares are no longer simple pieces of code that rely on unsuspecting users to spread and thrive. They can change, adapt and hide themselves from analysts, using very sophisticated techniques. Static analysis is complex and time consuming, and it could be difficult to deduce every possible malicious behaviour, yet it is often very effective because it hinders the capability of malware to detect the analysis environment.  The purpose of MalwareStats.org is to provide valuable assistance to the phase of static analysis, supporting analysts in their exploration of code features, by letting them make more focused, statistically motivated and structured decisions."
We are facing a "Big Data" problem. Thousands of samples produce hundreds of thousands of results, which end up being gigabytes of well-structured text. And yes, I want to compute general statistics for now (general !== "time frame defined"), so I am not interested in filtering data (well, I know I will end up putting a time filter on the main page, but not today!). My main goal is to answer, as quickly as possible, questions such as "What are the most used packers?", "What are the most used evasion techniques?" or "What are the most used APIs or anti-debugging techniques?" and so on. Obviously I want to present these statistics through a simple and intuitive web interface. You might wonder why those questions are so important to me. Well, because they really drive my decisions during a romantic malware analysis.
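As a rough illustration of the kind of aggregate question listed above, the sketch below ranks packers by how many samples use them. The record layout (a dict with a `packers` list), the function name `top_values` and the sample values are assumptions made for the example, not the actual MalwareStats.org data format.

```python
from collections import Counter

# Hypothetical analysis records; the real MalwareStats.org schema may differ.
reports = [
    {"sample": "s1", "packers": ["UPX"]},
    {"sample": "s2", "packers": ["UPX", "ASPack"]},
    {"sample": "s3", "packers": ["Themida"]},
    {"sample": "s4", "packers": ["UPX", "ASPack"]},
    {"sample": "s5"},  # a report with no packer information at all
]


def top_values(records, field, n=3):
    """Return the n most common values found under `field` across all records."""
    counts = Counter()
    for record in records:
        # Count each value once per sample so a single report cannot skew the ranking.
        counts.update(set(record.get(field, [])))
    return counts.most_common(n)


print(top_values(reports, "packers", 2))  # [('UPX', 3), ('ASPack', 2)]
```

The same function answers the other questions (evasion techniques, APIs) by changing the `field` argument, provided the records carry those lists.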

The following image shows today's stats on MalwareStats.org:

MalwareStats.org detail


In order to provide a fast and reliable web visualization interface I tried several algorithms and several frameworks, but my best choice (so far) has been to approach the problem using JavaScript "Web Workers" (HTML5).

MalwareStats.org total samples.


From W3Schools:
A web worker is a JavaScript that runs in the background, independently of other scripts, without affecting the performance of the page. You can continue to do whatever you want: clicking, selecting things, etc., while the web worker runs in the background.

The new, simple algorithm behind the huge visualization improvement over the last two versions is available here. It is not the best I could write and it is not remarkable in any way, but it made a huge difference. The following image shows the main function responsible for building the output before passing it to Google graphs.

Simple Visualization Algorithm
As you might agree, the entire code should be hardened (it is currently not protected against undefined values, null pointers, etc.) and could be made even faster by introducing multiple web workers. If you would like to get involved in the project just drop me an email; any suggestion is welcome as well. Enjoy the new results!
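The two improvements mentioned above (guarding against missing or null values, and splitting the work across several workers) can be sketched as follows. This is not the site's code: Python threads merely stand in for browser web workers, and the record layout and function names are assumptions. In CPython the threads will not make the counting itself faster; the point is the split, count and merge shape.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_chunk(chunk, field):
    """Count the values of `field` in one chunk, skipping missing or null entries."""
    counts = Counter()
    for record in chunk:
        if not isinstance(record, dict):
            continue  # guard against null or malformed rows
        values = record.get(field) or []  # a missing key or None becomes an empty list
        counts.update(values)
    return counts


def parallel_counts(records, field, workers=4):
    """Split records into chunks, count each chunk in its own worker, merge the results."""
    size = max(1, -(-len(records) // workers))  # ceiling division, at least 1
    chunks = [records[i:i + size] for i in range(0, len(records), size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(lambda chunk: count_chunk(chunk, field), chunks):
            total.update(partial)  # merging partial counts is a plain addition
    return total


# Hypothetical rows, including the kinds of gaps the guards are meant to survive.
rows = [{"apis": ["Sleep", "VirtualAlloc"]}, None, {"apis": None}, {}, {"apis": ["Sleep"]}]
print(parallel_counts(rows, "apis", workers=2))
```

Because each worker returns an independent partial count, the merge step does not care how many workers there are or in which order they finish.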

Thursday, September 3, 2015

Shifu: A new interesting Banking Trojan

Hello everybody, today I'd like to share some information on "Shifu", a new and incredibly interesting banking trojan. At this point you might think:
"Why are you writing about Shifu among the many other new threats (some even more discussed) out there?"
Well... Shifu is a new banking trojan which currently attacks mostly Japanese banks. It is well geo-localized and will probably end up hitting a specific set of organizations, but what fascinates me is the way it implements many features by copying what some of the "best in class" known malware has done so far. Shifu implements the following features:
  • Domain Generation Algorithm (DGA): Shifu uses the Shiz Trojan’s DGA. The exposed algorithm itself is easy to find online, and the developers behind Shifu have elected to use it for the generation of random domain names for covert botnet communications. 
  • Theft From Bank Apps: Theft of passwords, authentication token files, user certificate keys and sensitive data from Java applets is one of Shifu’s principal mechanisms. This type of modus operandi is familiar from Corcow’s and Shiz’s codes. Both Trojans used these mechanisms to target the banking applications of Russia- and Ukraine-based banks. Shifu, too, targets Russian banks as part of its target list in addition to Japanese banks.
  •  Anti-Sec: Shifu’s string obfuscation and anti-research techniques were taken from Zeus VM (in its Chtonik/Maple variation), including anti-VM and the disabling of security tools and sandboxes. 
  • Stealth: Part of Shifu’s stealth techniques are unique to the Gozi/ISFB Trojan, and Shifu uses Gozi’s exact same command execution scheme to hide itself in the Windows file system.
  • Config: The Shifu Trojan is operated with a configuration file written in XML format — not a common format for Trojans, and similar to the Dridex Trojan’s configuration (Dridex is a Bugat offspring). 
  • Wipe System Restore: Shifu wipes the local System Restore point on infected machines in a similar way to the Conficker worm, which was popular in 2009. 
  • Communication protocol: Shifu implements an SSL communication layer based on a self-signed certificate. The implemented module reminds analysts of the one used in the Dyre Trojan campaigns in late 2015.
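Since the first item in the list mentions a DGA, here is a minimal defender-side sketch of a common rough heuristic for flagging algorithmically generated names: the Shannon entropy of the leftmost DNS label. This is not Shiz's or Shifu's algorithm, and the threshold, minimum length and sample domains are assumptions chosen only for illustration; real detectors combine several signals to limit false positives.

```python
import math
from collections import Counter


def first_label(domain):
    """Return the leftmost DNS label, lower-cased."""
    return domain.lower().split(".")[0]


def label_entropy(domain):
    """Shannon entropy, in bits per character, of the leftmost DNS label."""
    label = first_label(domain)
    if not label:
        return 0.0
    total = len(label)
    counts = Counter(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_generated(domain, threshold=3.5, min_length=10):
    """Flag long labels whose characters are spread almost uniformly (assumed cut-offs)."""
    return len(first_label(domain)) >= min_length and label_entropy(domain) >= threshold


for name in ["google.com", "xk2r9qpl7vz4m.com"]:  # made-up sample domains
    print(name, round(label_entropy(name), 2), looks_generated(name))
```

Short or dictionary-like labels score low, while long labels with few repeated characters score high, which is why the length check is applied before the entropy check.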
Another interesting feature concerns Point of Sale (POS) systems. To make matters worse, Shifu searches for specific POS memory strings (and processes). If it finds a POS trace, it starts a procedure to steal credit card numbers.



Last but not least, Shifu makes sure no one else will own the attacked system. Once installed on the victim machine, it starts an "AV" procedure (forgive me, it is not actually an AV procedure, but it gives the idea) which locates "suspicious" files and denies their installation. According to IBM Security Intelligence's report (here), the malware was likely developed by a Russian group.

Let's get our hands dirty with some basic reverse engineering to see which countermeasures it really adopts. In the IBM report (linked above) you can find the malware signature (NmE5ZDRhMzIzOTg3NDg5YzhlOGI1NTc2ZjY3YjJjOTQ), which can be used on common online sandbox systems to look for samples. As you might observe, the sample I've got implements some anti-debugging techniques as well as some basic sandbox evasion techniques (for more information please have a look at MalwareStats.org):

GetLastError, IsDebuggerPresent, GetVolumeInformations, etc..
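A side note on the signature string quoted from the IBM report: it looks like Base64 with the trailing padding stripped. A quick sketch under that assumption decodes it to a 32-character hexadecimal string, which has the shape of an MD5 digest and is a form that online sandboxes commonly accept as a search key. The helper name is hypothetical.

```python
import base64
import string

# Signature string exactly as quoted in the text above.
SIGNATURE = "NmE5ZDRhMzIzOTg3NDg5YzhlOGI1NTc2ZjY3YjJjOTQ"


def decode_unpadded_b64(text):
    """Decode Base64 text whose trailing '=' padding has been stripped."""
    padded = text + "=" * (-len(text) % 4)  # restore padding to a multiple of 4
    return base64.b64decode(padded).decode("ascii")


digest = decode_unpadded_b64(SIGNATURE)
print(digest)  # 6a9d4a323987489c8e8b5576f67b2c94
print(len(digest) == 32 and all(ch in string.hexdigits for ch in digest))  # True
```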
An interesting sequence of API calls was found:
  • GetProcAddress: retrieves the address of an exported function or variable from the specified dynamic-link library.
  • VirtualProtect (stack): changes the protection on a region of committed pages in the virtual address space of the calling process.
  • VirtualAlloc: reserves, commits, or changes the state of a region of pages in the virtual address space of the calling process. Memory allocated by this function is automatically initialized to zero.
  • Sleep: suspends the execution of the current thread until the time-out interval elapses.
  • VirtualAlloc (again).
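One simple way to look for a call sequence like the one above in a sandbox trace is an ordered-subsequence match that tolerates unrelated calls in between. The sketch below assumes the trace is a flat list of API names (real sandbox reports are richer) and uses the documented Win32 name GetProcAddress; the function name and the sample trace are made up for the example.

```python
# Ordered pattern taken from the sequence described above.
PATTERN = ["GetProcAddress", "VirtualProtect", "VirtualAlloc", "Sleep", "VirtualAlloc"]


def contains_in_order(trace, pattern):
    """True if every call in `pattern` appears in `trace` in the same order (gaps allowed)."""
    calls = iter(trace)
    # `step in calls` consumes the iterator up to the first match,
    # so each later step can only match calls that come afterwards.
    return all(step in calls for step in pattern)


# Hypothetical trace with unrelated calls interleaved.
trace = [
    "LoadLibraryA", "GetProcAddress", "VirtualProtect", "GetLastError",
    "VirtualAlloc", "Sleep", "VirtualAlloc",
]
print(contains_in_order(trace, PATTERN))  # True
```

Matching on order rather than on exact adjacency keeps the check robust against the extra calls that packers and runtime libraries typically sprinkle in.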


Another interesting pattern found during the simple static analysis (shown in the following image) is the dynamic loading of a (previously downloaded) library. As you may observe, on row 2861 the sample points to a specific location and calls LoadLibraryA to load it into memory.

Dynamically Loaded DLL
Dynamic analysis clearly shows the sample's RAT features: it spawns a shell (on my machine PID 1388, with parent PID 788 belonging to the executed sample) and executes commands. Unfortunately, the evasion techniques detected the sandbox execution. The following image shows the check for the presence of Python, which is often one of the detection mechanisms (how many common users have Python on their Windows machines? Not many, really).

Python Detection

After a simple de-obfuscation round (a Visual C packer was detected) the analyst can appreciate the command line parser, probably the one used to communicate with the Command and Control (no further analysis has been performed).

Command Line Parser
Network-wise, the sample embeds the following addresses:
  • download.windowsupdate.com (191.234.4.50). Noise maker.
  • eboduftazce-ru.com (188.42.254.65). Much more interesting, because it is geolocated in China and the domain has changed servers at least twice during the last year.
 
WhoIS
A simple nmap scan of it shows an nginx server up and running on both ports 80 and 443 (used to communicate with the malware), an SSH daemon active on the standard port, and an interesting open TCP port 53. The statically analyzed behaviour presents the following timeline (click on it to enlarge):

Behaviour Time Line
Not really a significant one, but the spawned cmd.exe feels like a hero. To conclude, I wanted to record on these pages this significant piece of malware, which embeds many different techniques borrowed from many older malware families. It underlines a new skill set among malware writers, who can build harder and harder pieces of code at will just by combining features from different malware.