Archive for the ‘Microsoft Detection and Response Team (DART)’ Category

Success in security: reining in entropy

May 20th, 2020

Your network is unique. It’s a living, breathing system evolving over time. Data is created. Data is processed. Data is accessed. Data is manipulated. Data can be forgotten. The applications and users performing these actions are all unique parts of the system, adding degrees of disorder and entropy to your operating environment. No two networks on the planet are exactly the same, even if they operate within the same industry, utilize the exact same applications, and even hire workers from one another. In fact, the only attribute your network may share with another network is simply how unique they are from one another.

If we follow the analogy of an organization or network as a living being, it’s logical to drill down deeper, into the individual computers, applications, and users that function as cells within our organism. Each cell is unique in how it’s configured, how it operates, the knowledge or data it brings to the network, and even the vulnerabilities each piece carries with it. It’s important to note that cancer begins at the cellular level and can ultimately bring down the entire system. And where incident response and recovery are concerned, the greater the level of entropy and chaos across a system, the more difficult it becomes to locate potentially harmful entities. Incident response is about locating the source of the cancer in a system in an effort to remove it and make the system healthy once more.

Let’s take the human body, for example. A body that remains at rest 8-10 hours a day, working from a chair in front of a computer with very little physical activity, will start to develop health issues. The longer the body remains in this state, the further it drifts from an ideal state, and small problems begin to manifest. Perhaps it’s diabetes. Maybe it’s high blood pressure. Or it could be weight gain creating fatigue in the joints and muscles. Your network is similar to the body. The longer we leave the network unattended, the more it drifts from an ideal state to a state where small problems begin to manifest, putting the entire system at risk.

Why is this important? Let’s consider an incident response process where a network has been compromised. As responders and investigators, we want to discover what happened, what the cause was, what the damage is, and how best we can fix the issue and get back on the road to a healthy state. This entails looking for clues or anomalies: things that stand out from the normal background noise of an operating network. In essence, we identify what’s truly unique in the system and drill down on those items. Are we able to identify cancerous cells because they look and act so differently from the vast majority of the other healthy cells?

Consider a medium-sized organization with 5,000 computer systems. Last week, the organization was notified by a law enforcement agency that customer data was discovered on the dark web, dated two weeks prior. We start our investigation on the date we know the data likely left the network. What computer systems hold that data? What users have access to those systems? What windows of time are normal for those users to interact with the systems? What processes or services are running on those systems? Forensically, we want to know what system was impacted, who was logging in to the system around the timeframe in question, what actions were performed, where those logins came from, and whether there are any unique indicators. Unique indicators are items that stand out from the normal operating environment: unique users, system interaction times, protocols, binary files, data files, services, registry keys, and configurations (such as rogue registry keys).

Our investigation reveals a unique service running on a member server with SQL Server. Analysis shows the service has an autostart entry in the registry and, every time the system is rebooted, starts from a file in the c:\windows\perflogs directory, an unusual location for an autostart. We haven’t seen this service before, so we investigate across all the systems on the network to locate other instances of the registry startup key or the binary files we’ve identified. Out of 5,000 systems, we locate these pieces of evidence on only three systems, one of which is a Domain Controller.
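Hunting for an autostart like this can be scripted for sweeps across a fleet. Below is a minimal per-host sketch in Python, assuming it runs on each Windows system through whatever remote-execution mechanism you already trust; the perflogs path is the indicator from this example, and the key selection is illustrative rather than exhaustive.

```python
# Sketch: flag autostart entries that launch from c:\windows\perflogs.
# Checks Run and Services keys only; real sweeps cover many more autostarts.
import winreg

SUSPICIOUS_DIR = r"c:\windows\perflogs"  # the unusual autostart location

def check_run_key():
    """Flag Run-key values whose command line points into perflogs."""
    hits = []
    path = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Run"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        index = 0
        while True:
            try:
                name, value, _ = winreg.EnumValue(key, index)
            except OSError:  # no more values
                break
            if SUSPICIOUS_DIR in str(value).lower():
                hits.append((name, value))
            index += 1
    return hits

def check_services():
    """Flag services whose ImagePath points into perflogs."""
    hits = []
    base = r"SYSTEM\CurrentControlSet\Services"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, base) as root:
        for i in range(winreg.QueryInfoKey(root)[0]):  # number of subkeys
            service = winreg.EnumKey(root, i)
            try:
                with winreg.OpenKey(root, service) as key:
                    image_path, _ = winreg.QueryValueEx(key, "ImagePath")
            except OSError:  # service without an ImagePath value
                continue
            if SUSPICIOUS_DIR in str(image_path).lower():
                hits.append((service, image_path))
    return hits

if __name__ == "__main__":
    for name, command in check_run_key() + check_services():
        print(f"SUSPICIOUS AUTOSTART: {name} -> {command}")
```

In practice an EDR platform’s hunting queries do this at scale; the point is simply that a unique indicator becomes trivially searchable once you know what “normal” looks like.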

This process of identifying what is unique allows our investigative team to highlight the systems, users, and data at risk during a compromise. It also helps us potentially identify the source of the attack, what data may have been pilfered, and the external internet hosts calling the shots and enabling access to the environment. Additionally, any recovery effort will require this information to succeed.

This all sounds like common sense, so why cover it here? Remember we discussed how unique your network is, and how there are no other systems exactly like it elsewhere in the world? That means every investigative process into a network compromise is also unique, even if the same attack vector is being used to attack multiple organizational entities. We want to provide the best foundation for a secure environment and the investigative process, now, while we’re not in the middle of an active investigation.

The unique nature of a system isn’t inherently a bad thing. Your network can be unique from other networks. In many cases, it may even provide a strategic advantage over your competitors. Where we run afoul of security best practice is when we allow too much entropy to build up on the network, losing the ability to differentiate “normal” from “abnormal.” In short, will we be able to easily locate the evidence of a compromise because it stands out from the rest of the network, or are we hunting for the proverbial needle in a haystack? Clues related to a system compromise don’t stand out if everything we look at appears abnormal. This can exacerbate an already tense response situation, extending the timeframe for investigation and dramatically increasing the cost required to return to a trusted operating state.

To tie this back to our human body analogy, when a breathing problem appears, we need to be able to understand whether this is new, or whether it’s something we already know about, such as asthma. It’s much more difficult to correctly identify and recover from a problem if it blends in with the background noise, such as difficulty breathing because of air quality, lack of exercise, smoking, or allergies. You can’t know what’s unique if you don’t already know what’s normal or healthy.

To counter this problem, we pre-emptively bring the background noise on the network to a manageable level. All systems move towards entropy unless acted upon. We must put energy into the security process to counter the growth of entropy, which would otherwise exponentially complicate our security problem set. Standardization and control are the keys here. If we limit what users can install on their systems, we quickly notice when an untrusted application is being installed. If it’s against policy for a Domain Administrator to log in to Tier 2 workstations, then any attempts to do this will stand out. If it’s unusual for Domain Controllers to create outgoing web traffic, then it stands out when this occurs or is attempted.
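Checks like the Tier 2 example lend themselves to simple automation once the policy is written down. The sketch below is illustrative only; it assumes logon events have been exported to a CSV, and the column names and account list are hypothetical.

```python
# Sketch: flag logons that violate tiering policy (e.g., Domain Admins on
# Tier 2 workstations). The CSV export and its columns are hypothetical.
import csv

DOMAIN_ADMINS = {"CONTOSO\\dadmin1", "CONTOSO\\dadmin2"}  # hypothetical list

def tiering_violations(csv_path):
    violations = []
    with open(csv_path, newline="") as f:
        for event in csv.DictReader(f):
            # Policy: Domain Admins must never log on to Tier 2 hosts.
            if event["account"] in DOMAIN_ADMINS and event["tier"] == "2":
                violations.append(event)
    return violations

for v in tiering_violations("logon_events.csv"):
    print(f"VIOLATION: {v['account']} logged on to {v['host']}")
```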

Centralize the security process. Enable that process. Standardize security configuration, monitoring, and expectations across the organization. Enforce those standards. Enforce the tenet of least privilege across all user levels. Understand your ingress and egress network traffic patterns, and when those are allowed or blocked.

In the end, your success in investigating and responding to inevitable security incidents depends on what your organization does on the network today, not during an active investigation. By reducing entropy on your network and defining what “normal” looks like, you’ll be better prepared to quickly identify questionable activity on your network and respond appropriately. Bear in mind that security is a continuous process and should not stop. The longer we ignore the security problem, the further the state of the network will drift from “standardized and controlled” back into disorder and entropy. And the further we sit from that state of normal, the more difficult and time consuming it will be to bring our network back to a trusted operating environment in the event of an incident or compromise.


Lessons learned from the Microsoft SOC—Part 3c: A day in the life part 2

May 5th, 2020

This is the sixth blog in the Lessons learned from the Microsoft SOC series designed to share our approach and experience from the front lines of our security operations center (SOC) protecting Microsoft and our Detection and Response Team (DART) helping our customers with their incidents. For a visual depiction of our SOC philosophy, download our Minutes Matter poster.

COVID-19 and the SOC

Before we conclude the day in the life, we thought we would share an analyst’s eye view of the impact of COVID-19. Our analysts are mostly working from home now, and our cloud-based tooling approach enabled this transition to go pretty smoothly. The differences we have seen in attacks are mostly in the early stages: phishing lures designed to exploit emotions related to the current pandemic, and an increased focus on home firewalls and routers (using techniques like RDP brute-forcing attempts and DNS poisoning—more here). The attack techniques adversaries employ after that are fairly consistent with what they were doing before.

A day in the life—remediation

When we last left our heroes in the previous entry, our analyst had built a timeline of the potential adversary attack operation. Of course, knowing what happened doesn’t actually stop the adversary or reduce organizational risk, so let’s remediate this attack!

  1. Decide and act—As the analyst develops a high enough level of confidence that they understand the story and scope of the attack, they quickly shift to planning and executing cleanup actions. While this appears as a separate step in this particular description, our analysts often execute on cleanup operations as they find them.

Big Bang or clean as you go?

Depending on the nature and scope of the attack, analysts may clean up attacker artifacts as they go (emails, hosts, identities), or they may build a list of compromised resources to clean up all at once (Big Bang).

  • Clean as you go—For most typical incidents that are detected early in the attack operation, analysts quickly clean up the artifacts as we find them. This rapidly puts the adversary at a disadvantage and prevents them from moving forward with the next stage of their attack.
  • Prepare for a Big Bang—This approach is appropriate for a scenario where an adversary has already “settled in” and established redundant access mechanisms to the environment (frequently seen in incidents investigated by our Detection and Response Team (DART) at customers). In this case, analysts should avoid tipping off the adversary until all attacker presence has been discovered, as surprise can help with fully disrupting their operation. We have learned that partial remediation often tips off an adversary, which gives them a chance to react and rapidly make the incident worse (spread further, change access methods to evade detection, inflict damage/destruction for revenge, cover their tracks, etc.). Note that cleaning up phishing and malicious emails can often be done without tipping off the adversary, but cleaning up host malware and reclaiming control of accounts has a high chance of tipping off the adversary.

These are not easy decisions to make, and we have found no substitute for experience in making these judgment calls. The collaborative work environment and culture we have built in our SOC helps immensely, as our analysts can tap into each other’s experience to help make these tough calls.

The specific response steps are very dependent on the nature of the attack, but the most common procedures used by our analysts include:

  • Client endpoints—SOC analysts can isolate a computer and contact the user directly (or IT operations/helpdesk) to have them initiate a reinstallation procedure.
  • Server or applications—SOC analysts typically work with IT operations and/or application owners to arrange rapid remediation of these resources.
  • User accounts—We typically reclaim control of compromised accounts by disabling the account and resetting the password (though these procedures are evolving, as a large portion of our users are passwordless, using Windows Hello or another form of MFA). Our analysts also explicitly expire all authentication tokens for the user with Microsoft Cloud App Security.
    Analysts also review the multi-factor phone number and device enrollment to ensure they haven’t been hijacked (often contacting the user), and reset this information as needed. A sketch of these containment steps appears after this list.
  • Service Accounts—Because of the high risk of service/business impact, SOC analysts work with the service account owner of record (falling back on IT operations as needed) to arrange rapid remediation of these resources.
  • Emails—The attack/phishing emails are deleted (and sometimes purged to prevent recovery of deleted emails), but we always save a copy of the original email in the case notes for later search and analysis (headers, content, scripts/attachments, etc.).
  • Other—Custom actions can also be executed based on the nature of the attack such as revoking application tokens, reconfiguring servers and services, and more.
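As referenced in the user accounts item above, here is a minimal sketch of the disable-and-revoke steps using the Microsoft Graph API. It assumes an access token with the appropriate permissions (for example, User.ReadWrite.All and User.RevokeSessions.All) has already been acquired, and it omits the MFA device review and error handling.

```python
# Sketch: contain a compromised user account via Microsoft Graph.
# Assumes a token has already been acquired (e.g., with MSAL).
import requests

GRAPH = "https://graph.microsoft.com/v1.0"

def contain_account(token: str, user_id: str) -> None:
    headers = {"Authorization": f"Bearer {token}",
               "Content-Type": "application/json"}
    # 1. Disable the account so no new sign-ins succeed.
    requests.patch(f"{GRAPH}/users/{user_id}", headers=headers,
                   json={"accountEnabled": False}).raise_for_status()
    # 2. Revoke refresh tokens/session cookies so existing sessions expire.
    requests.post(f"{GRAPH}/users/{user_id}/revokeSignInSessions",
                  headers=headers).raise_for_status()
```

A password reset and the MFA enrollment review would then follow through your identity management tooling.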

Automation and integration for the win

It’s hard to overstate the value of integrated tools and process automation, as these bring so many benefits—improving the analysts’ daily experience and improving the SOC’s ability to reduce organizational risk.

  • Analysts spend less time on each incident, reducing the attacker’s time to operate—measured by mean time to remediate (MTTR). A toy computation of MTTR and MTTA follows this list.
  • Analysts aren’t bogged down in manual administrative tasks so they can react quickly to new detections (reducing mean time to acknowledge—MTTA).
  • Analysts have more time to engage in proactive activities that both reduce organization risk and increase morale by keeping them focused on the mission.
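As a toy illustration of the metrics above, MTTA and MTTR fall directly out of case timestamps. The field names below are hypothetical; per the description in this series, MTTA runs from detection to acknowledgement and MTTR from the start of active investigation to remediation.

```python
# Toy sketch: MTTA and MTTR from case timestamps (hypothetical fields).
from datetime import datetime
from statistics import mean

cases = [
    {"detected": datetime(2020, 5, 1, 9, 0),
     "acknowledged": datetime(2020, 5, 1, 9, 4),
     "investigation_started": datetime(2020, 5, 1, 9, 10),
     "remediated": datetime(2020, 5, 1, 11, 30)},
    # ...more closed cases...
]

def mean_minutes(start_field, end_field):
    """Average elapsed minutes between two case timestamps."""
    return mean((c[end_field] - c[start_field]).total_seconds()
                for c in cases) / 60

print(f"MTTA: {mean_minutes('detected', 'acknowledged'):.1f} min")
print(f"MTTR: {mean_minutes('investigation_started', 'remediated'):.1f} min")
```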

Our SOC has a long history of developing its own automation and scripts, built by a dedicated automation team, to make analysts’ lives easier. Because custom automation requires ongoing maintenance and support, we are constantly looking for ways to shift automation and integration to capabilities provided by Microsoft engineering teams (which also benefits our customers). While still early in this journey, this approach typically improves the analyst experience and reduces maintenance effort and challenges.

This is a complex topic that could fill many blogs, but this takes two main forms:

  • Integrated toolsets save analysts manual effort during incidents by allowing them to easily navigate multiple tools and datasets. Our SOC relies heavily on the integration of Microsoft Threat Protection (MTP) tools for this experience, which also saves the automation team from writing and supporting custom integration for this.
  • Automation and orchestration capabilities reduce manual analyst work by automating repetitive tasks and orchestrating actions between different tools. Our SOC currently relies on an advanced custom SOAR platform and is actively working with our engineering teams (MTP’s AutoIR capability and Azure Sentinel SOAR) on how to shift our learnings and workload onto those capabilities.

After the attacker operation has been fully disrupted, the analyst marks the case as remediated, which is the timestamp signaling the end of MTTR measurement (which started when the analyst began the active investigation in step 2 of the previous blog).

While having a security incident is bad, having the same incident repeated multiple times is much worse.

  1. Post-incident cleanup—Because lessons aren’t actually “learned” unless they change future actions, our analysts always integrate any useful information learned from the investigation back into our systems. Analysts capture these learnings so that we avoid repeating manual work in the future and can rapidly see connections between past and future incidents by the same threat actors. This can take a number of forms, but common procedures include:
    • Indicators of Compromise (IoCs)—Our analysts record any applicable IoCs such as file hashes, malicious IP addresses, and email attributes into our threat intelligence systems so that our SOC (and all customers) can benefit from these learnings (a minimal record format is sketched after this list).
    • Unknown or unpatched vulnerabilities—Our analysts can initiate processes to ensure that missing security patches are applied, misconfigurations are corrected, and vendors (including Microsoft) are informed of “zero day” vulnerabilities so that they can create security patches for them.
    • Internal actions such as enabling logging on assets and adding or changing security controls. 
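For the IoC item above, the exact record format depends on your threat intelligence system; the sketch below simply shows IoCs normalized into consistent, append-only records. The schema and file target are hypothetical.

```python
# Sketch: normalize IoCs from a closed case into consistent records for a
# threat intelligence system. Schema and write target are hypothetical.
import json
from datetime import datetime, timezone

def record_ioc(ioc_type, value, case_id, actor=None):
    record = {
        "type": ioc_type,      # e.g., "sha256", "ip", "email_subject"
        "value": value,
        "case": case_id,       # links future matches back to this incident
        "actor": actor,        # threat actor attribution, if known
        "recorded": datetime.now(timezone.utc).isoformat(),
    }
    with open("ioc_feed.jsonl", "a") as feed:
        feed.write(json.dumps(record) + "\n")

record_ioc("ip", "203.0.113.7", "CASE-1234")  # documentation-range IP
```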

Continuous improvement

So the adversary has now been kicked out of the environment and their current operation poses no further risk. Is this the end? Will they retire and open a cupcake bakery or auto repair shop? Not likely after just one failure, but we can consistently disrupt their successes by increasing the cost of attack and reducing the return, which will deter more and more attacks over time. For now, we must assume that adversaries will try to learn from what happened on this attack and try again with fresh ideas and tools.

Because of this, our analysts also focus on learning from each incident to improve their skills, processes, and tooling. This continuous improvement occurs through many informal and formal processes ranging from formal case reviews to casual conversations where they tell the stories of incidents and interesting observations.

As caseload allows, the investigation team also hunts proactively for adversaries when they are not on shift, which helps them stay sharp and grow their skills.

This closes our virtual shift visit for the investigation team. Join us next time as we shift to our Threat hunting team (a.k.a. Tier 3) and get some hard-won advice and lessons learned.

…until then, share and enjoy!

P.S. If you are looking for more information on the SOC and other cybersecurity topics, check out previous entries in the series (Part 1 | Part 2a | Part 2b | Part 3a | Part 3b), Mark’s List (https://aka.ms/markslist), and our new security documentation site—https://aka.ms/securtydocs. Be sure to bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity. Or reach out to Mark on LinkedIn or Twitter.


Defending the power grid against supply chain attacks: Part 3 – Risk management strategies for the utilities industry

April 22nd, 2020

Over the last fifteen years, attacks against critical infrastructure (Figure 1) have steadily increased in both volume and sophistication. Because of the strategic importance of this industry to national security and economic stability, these organizations are targeted by sophisticated, patient, and well-funded adversaries. Adversaries often target the utility supply chain to insert malware into devices destined for the power grid. As modern infrastructure becomes more reliant on connected devices, the power industry must continue to come together to improve security at every step of the process.


Figure 1: Increased attacks on critical infrastructure

This is the third and final post in the “Defending the power grid against supply chain attacks” series. In the first blog I described the nature of the risk. Last month I outlined how utility suppliers can better secure the devices they manufacture. Today’s advice is directed at the utilities. There are actions you can take as individual companies and as an industry to reduce risk.

Implement operational technology security best practices

According to Verizon’s 2019 Data Breach Investigations Report, 80 percent of hacking-related breaches are the result of weak or compromised passwords. If you haven’t implemented multi-factor authentication (MFA) for all your user accounts, make it a priority. MFA can significantly reduce the likelihood that a user with a stolen password can access your company assets. I also recommend you take these additional steps to protect administrator accounts:

  • Separate administrative accounts from the accounts that IT professionals use to conduct routine business. While administrators are answering emails or conducting other productivity tasks, they may be targeted by a phishing campaign. You don’t want them signed into a privileged account when this happens.
  • Apply just-in-time privileges to your administrator accounts. Just-in-time privileges require that administrators only sign into a privileged account when they need to perform a specific administrative task. These sign-ins go through an approval process and have a time limit. This will reduce the possibility that someone is unnecessarily signed into an administrative account.

 


Figure 2: A “blue” path depicts how a standard user account is used for non-privileged access to resources like email and web browsing and day-to-day work. A “red” path shows how privileged access occurs on a hardened device to reduce the risk of phishing and other web and email attacks. 

  • Set up privileged access workstations for administrative work. A privileged access workstation provides a dedicated operating system with the strongest security controls for sensitive tasks, protecting these activities and accounts from the internet. You also don’t want the occasional security mistake, like clicking on a link when administrators are tired or distracted, to compromise a workstation that has direct access to these critical systems. To encourage administrators to follow security practices, make sure they have easy access to a standard workstation for other, more routine tasks.

The following security best practices will also reduce your risk:

  • Whitelist approved applications. Define the list of software applications and executables that are approved to be on your networks, and block everything else. Especially target systems that are internet-facing, as well as Human-Machine Interface (HMI) systems, which play the critical role of managing generation, transmission, or distribution of electricity.
  • Regularly patch software and operating systems. Implement a monthly practice of applying security patches to software on all your systems. This includes applications and operating systems on servers, desktop computers, mobile devices, and network devices (routers, switches, firewalls, etc.), as well as Internet of Things (IoT) and Industrial Internet of Things (IIoT) devices. Attackers frequently target known security vulnerabilities.
  • Protect legacy systems. Segment legacy systems that can no longer be patched by using firewalls to filter out unnecessary traffic. Limit access to only those who need it by applying Just-in-Time and Just-Enough-Access principles and requiring MFA. Once you set up these subnets, firewalls, and firewall rules to protect the isolated systems, you must continually audit and test these controls for inadvertent changes, and validate with penetration testing and red teaming to identify rogue bridging endpoints and design/implementation weaknesses.
  • Segment your networks. If you are attacked, it’s important to limit the damage. By segmenting your network, you make it harder for an attacker to compromise more than one critical site. Maintain your corporate network on its own network with limited to no connection to critical sites like generation and transmission networks. Run each generating site on its own network with no connection to other generating sites. This will ensure that should a generating site become compromised, attackers can’t easily traverse to other sites and have a greater impact.
  • Turn off all unnecessary services. Confirm that none of your software has automatically enabled a service you don’t need. You may also discover services running that you no longer use. If the business doesn’t need a service, turn it off (a simple audit sketch follows this list).
  • Deploy threat protection solutions. Services like Microsoft Threat Protection help you automatically detect, respond to, and correlate incidents across domains.
  • Implement an incident response plan. When an attack happens, you need to respond quickly to reduce the damage and get your organization back up and running. Refer to Microsoft’s Incident Response Reference Guide for more details.
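As referenced in the unnecessary-services item above, a periodic audit of listening services against an approved baseline makes drift visible. This sketch assumes the third-party psutil package and a hypothetical per-role port baseline; it is a starting point, not a substitute for proper configuration management.

```python
# Sketch: audit listening services against an approved baseline so that
# unnecessary or unexpected services stand out.
import psutil

APPROVED_PORTS = {22, 443, 1433}  # hypothetical baseline for this role

def unexpected_listeners():
    flagged = []
    for conn in psutil.net_connections(kind="inet"):
        if conn.status != psutil.CONN_LISTEN:
            continue
        if conn.laddr.port in APPROVED_PORTS:
            continue
        try:
            name = psutil.Process(conn.pid).name() if conn.pid else "unknown"
        except psutil.NoSuchProcess:
            name = "exited"
        flagged.append((conn.laddr.port, name))
    return flagged

for port, process in unexpected_listeners():
    print(f"Unexpected listener: port {port} ({process})")
```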

Speak with one voice

Power grids are interconnected systems of generating plants, wires, transformers, and substations. Regional electrical companies work together to efficiently balance the supply and demand for electricity across the nation. These same organizations have also come together to protect the grid from attack. As an industry, working through organizations like the Edison Electric Institute (EEI), utilities can define security standards and hold manufacturers accountable to those requirements.

It may also be useful to work with the Federal Energy Regulatory Commission (FERC), the North American Electric Reliability Corporation (NERC), or the United States Nuclear Regulatory Commission (U.S. NRC) to better regulate the security requirements of products manufactured for the electrical grid.

Apply extra scrutiny to IoT devices

As you purchase and deploy IoT devices, prioritize security. Be careful about purchasing products from countries that are motivated to infiltrate critical infrastructure. Conduct penetration tests against all new IoT and IIoT devices before you connect them to the network. When you place sensors on the grid, you’ll need to protect them from both cyberattacks and physical attacks. Make them hard to reach and tamper-proof.

Collaborate on solutions

Reducing the risk of a destabilizing power grid attack will require everyone in the utility industry to play a role. By working with manufacturers, trade organizations, and governments, electricity organizations can lead the effort to improve security across the industry. For utilities in the United States, several public-private programs are in place to enhance the utility industry’s capabilities to defend its infrastructure and respond to threats.

Read Part 1 in the series: “Defending the power grid against cyberattacks.”

Read “Defending the power grid against supply chain attacks: Part 2 – Securing hardware and software.”

Read how Microsoft Threat Protection can help you better secure your endpoints.

Learn how MSRC developed an incident response plan.

Bookmark the Security blog to keep up with our expert coverage on security matters. For more information about our security solutions visit our website. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity.


Full Operational Shutdown—another cybercrime case from the Microsoft Detection and Response Team

April 2nd, 2020

Recently, we published our first case report by the Microsoft Detection and Response Team (DART), 001: …And Then There Were Six. We received a strongly positive response from our customers and colleagues, and our team has been getting inquiries asking for more reports. We are glad to share DART Case Report 002: Full Operational Shutdown.

In Report 002, we cover an actual incident response engagement in which polymorphic malware spread through the entire network of an organization. A phishing email delivered Emotet, a polymorphic virus that propagates via network shares and legacy protocols, and the virus shut down the organization’s core services. It avoided detection by antivirus solutions through regular updates from an attacker-controlled command-and-control (C2) infrastructure and spread through the company’s systems, causing network outages and shutting down essential services for nearly a week. In the report, you can read the details of the attack and how DART responded, review the lateral progression diagram of the attack, and learn best practices from DART experts.

Stay tuned for more DART case reports where you’ll find unique stories from our team’s engagements around the globe. As always, you can reach out to your Microsoft account manager or Premier Support contact for more information on DART services.

 

DART provides the most complete and thorough investigations by leveraging a combination of proprietary tools and Microsoft Security products, close connections with internal Microsoft threat intelligence and product groups, as well as strategic partnerships with security organizations around the world.


Real-life cybercrime stories from DART, the Microsoft Detection and Response Team

March 9th, 2020

When we published our first blog about the Microsoft Detection and Response Team (DART) in March of 2019, we described our mission as responding to compromises and helping our customers become cyber-resilient. In pursuit of this mission we had already been providing onsite reactive incident response and remote proactive investigations to our customers long before our blog. And our response expertise has been leveraged many times by government and commercial entities around the world to help secure their most sensitive, critical environments.

When our team works on the frontlines of cybersecurity, chasing adversaries in many different digital estates on a daily basis, our experiences become valuable lessons on attacker methods as well as security best practices. And because of this, our colleagues and customers have been asking for case studies, reports, and even anecdotes from DART engagements.

Finally, we can respond to these inquiries by publishing our first DART Case Report 001: …And Then There Were Six. Case Report 001 is a story of cybercrime in which DART was called in to help identify and evict an attacker, only to discover there were already five more adversaries in the same environment. Read the full report for the details.

In the DART Case Reports, you will find unique stories from our team’s engagements around the globe: details on the attackers’ methods, a diagram of how they progressed in the environment, how DART was able to identify and evict them, as well as best practices to avoid similar incidents.

What you won’t find is any information about our customers, or their defenses, because in our reports we will focus solely on the attacker Tactics, Techniques, and Procedures (TTP) and how to defend against them. Read our first report, reach out to your Microsoft account manager or Premier Support contact if you need more information on DART services—and stay tuned for more DART Case Reports.

 

DART leverages Microsoft’s strategic partnerships with security organizations around the world and with internal Microsoft product groups to provide the most complete and thorough investigation possible.



Ghost in the shell: Investigating web shell attacks

February 4th, 2020

Recently, an organization in the public sector discovered that one of their internet-facing servers was misconfigured and allowed attackers to upload a web shell, which let the adversaries gain a foothold for further compromise. The organization enlisted the services of Microsoft’s Detection and Response Team (DART) to conduct a full incident response and remediate the threat before it could cause further damage.

DART’s investigation showed that the attackers uploaded a web shell into multiple folders on the web server, leading to the subsequent compromise of service accounts and domain admin accounts. This allowed the attackers to perform reconnaissance using net.exe, scan for additional target systems using nbtstat.exe, and eventually move laterally using PsExec.

The attackers installed additional web shells on other systems, as well as a DLL backdoor on an Outlook Web Access (OWA) server. To persist on the server, the backdoor implant registered itself as a service or as an Exchange transport agent, which allowed it to access and intercept all incoming and outgoing emails, exposing sensitive information. The backdoor also performed additional discovery activities as well as downloaded other malware payloads. In addition, the attackers sent special emails that the DLL backdoor interpreted as commands.

Figure 1. Sample web shell attack chain

The case is one of the increasingly common incidents of web shell attacks affecting multiple organizations in various sectors. A web shell is a piece of malicious code, often written in typical web development programming languages (e.g., ASP, PHP, JSP), that attackers implant on web servers to provide remote access and code execution to server functions. Web shells allow adversaries to execute commands and steal data from a web server, or to use the server as a launch pad for further attacks against the affected organization.

With the use of web shells in cyberattacks on the rise, Microsoft’s DART, the Microsoft Defender ATP Research Team, and the Microsoft Threat Intelligence Center (MSTIC) have been working together to investigate and closely monitor this threat.

Web shell attacks in the current threat landscape

Multiple threat actors, including ZINC, KRYPTON, and GALLIUM, have been observed utilizing web shells in their campaigns. To implant web shells, adversaries take advantage of security gaps in internet-facing web servers, typically vulnerabilities in web applications, for example CVE-2019-0604 or CVE-2019-16759.

In our investigations into these types of attacks, we have seen web shells within files that attempt to hide or blend in by using names commonly used for legitimate files in web servers (a simple sweep for such files is sketched after this list), for example:

  • index.aspx
  • fonts.aspx
  • css.aspx
  • global.aspx
  • default.php
  • function.php
  • Fileuploader.php
  • help.js
  • write.jsp
  • 31.jsp
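A simple content sweep can hunt for such files. The sketch below assumes Python running on the web server (or against a mounted copy of the content directory); the web root is a placeholder, and the eval-style patterns are illustrative examples commonly associated with one-line web shells, not a complete detection rule.

```python
# Sketch: sweep a web root for files using commonly abused names and for
# eval-style one-liner patterns associated with simple web shells.
import os
import re

WEB_ROOT = r"C:\inetpub\wwwroot"  # hypothetical content directory
COMMON_NAMES = {"index.aspx", "fonts.aspx", "css.aspx", "global.aspx",
                "default.php", "function.php", "fileuploader.php",
                "help.js", "write.jsp", "31.jsp"}
SHELL_PATTERNS = [re.compile(rb"eval\s*\(\s*request", re.IGNORECASE),
                  re.compile(rb"@eval\s*\(\s*\$_POST", re.IGNORECASE)]

for dirpath, _dirs, filenames in os.walk(WEB_ROOT):
    for filename in filenames:
        path = os.path.join(dirpath, filename)
        try:
            with open(path, "rb") as f:
                head = f.read(65536)  # one-liner shells fit in 64 KB
        except OSError:
            continue
        name_hit = filename.lower() in COMMON_NAMES
        content_hit = any(p.search(head) for p in SHELL_PATTERNS)
        if content_hit or (name_hit and b"eval" in head.lower()):
            print(f"Review: {path} (name={name_hit}, pattern={content_hit})")
```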

Among web shells used by threat actors, the China Chopper web shell is one of the most widely used. One example is written in JSP; we have seen this malicious JSP code within a specially crafted image file uploaded to web servers (Figure 2).

Figure 2. Specially crafted image file with malicious JSP code

Another China Chopper variant is written in PHP. Meanwhile, the KRYPTON group uses a bespoke web shell written in C# within an ASP.NET page (Figure 3).

Figure 3. Web shell written in C# within an ASP.NET page

Once a web shell is successfully inserted into a web server, it can allow remote attackers to perform various tasks on the web server. Web shells can steal data, perpetrate watering hole attacks, and run other malicious commands for further compromise.

Web shell attacks have affected a wide range of industries. The organization in the public sector mentioned above represents one of the most commonly targeted sectors.

Aside from exploiting vulnerabilities in web applications or web servers, attackers take advantage of other weaknesses in internet-facing servers. These include the lack of the latest security updates, antivirus tools, network protection, proper security configuration, and informed security monitoring. Interestingly, we observed that attacks usually occur on weekends or during off-hours, when they are less likely to be immediately spotted and responded to.

Unfortunately, these gaps appear to be widespread, given that every month, Microsoft Defender Advanced Threat Protection (ATP) detects an average of 77,000 web shell and related artifacts on an average of 46,000 distinct machines.

Figure 4: Web shell encounters

Detecting and mitigating web shell attacks

Because web shells are a multi-faceted threat, enterprises should build comprehensive defenses for multiple attack surfaces. Microsoft Threat Protection provides unified protection for identities, endpoints, email and data, apps, and infrastructure. Through signal-sharing across Microsoft services, customers can leverage Microsoft’s industry-leading optics and security technologies to combat web shells and other threats.

Gaining visibility into internet-facing servers is key to detecting and addressing the threat of web shells. The installation of web shells can be detected by monitoring web application directories for web script file writes. Applications such as Outlook Web Access (OWA) rarely change after they have been installed and script writes to these application directories should be treated as suspicious.
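File-write monitoring of this kind can be prototyped in a few lines. The sketch below assumes the third-party watchdog package and a hypothetical IIS content path; a production deployment would feed these events into your SIEM rather than print them.

```python
# Sketch: alert on new script files written under a web application
# directory, which should be rare after installation (e.g., for OWA).
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

SCRIPT_EXTS = (".aspx", ".asp", ".php", ".jsp", ".js")

class ScriptWriteHandler(FileSystemEventHandler):
    def on_created(self, event):
        if not event.is_directory and event.src_path.lower().endswith(SCRIPT_EXTS):
            # In production, raise a SIEM alert instead of printing.
            print(f"ALERT: new script in web directory: {event.src_path}")

observer = Observer()
observer.schedule(ScriptWriteHandler(), r"C:\inetpub\wwwroot", recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
```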

After installation, web shell activity can be detected by analyzing processes created by the Internet Information Services (IIS) process w3wp.exe. Sequences of processes that are associated with reconnaissance activity such as those identified in the alert screenshot (net.exe, ping.exe, systeminfo.exe, and hostname.exe) should be treated with suspicion. Web applications such as OWA run from well-defined Application Pools. Any cmd.exe process execution by w3wp.exe running from an application pool that doesn’t typically execute processes such as ‘MSExchangeOWAAppPool’ should be treated as unusual and regarded as potentially malicious.
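The same parent-child logic can be approximated with a quick process sweep. This sketch assumes the third-party psutil package; the suspect-process list mirrors the examples above and is illustrative, not exhaustive.

```python
# Sketch: flag shell-like child processes of IIS worker processes (w3wp.exe).
import psutil

SUSPECT_CHILDREN = {"cmd.exe", "powershell.exe", "net.exe", "ping.exe",
                    "systeminfo.exe", "hostname.exe"}

for proc in psutil.process_iter(["name"]):
    if (proc.info["name"] or "").lower() != "w3wp.exe":
        continue
    try:
        for child in proc.children(recursive=True):
            if child.name().lower() in SUSPECT_CHILDREN:
                print(f"Suspicious: w3wp.exe (pid {proc.pid}) spawned "
                      f"{child.name()} (pid {child.pid})")
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        continue
```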

Microsoft Defender ATP exposes these behaviors that indicate web shell installation and post-compromise activity by analyzing script file writes and process executions. When alerted of these activities, security operations teams can then use the rich capabilities in Microsoft Defender ATP to investigate and resolve web shell attacks.

Figure 5. Sample Microsoft Defender ATP alerts related to web shell attacks

Figure 6. Microsoft Defender ATP alert process tree

As in most security issues, prevention is critical. Organizations can harden systems against web shell attacks by taking these preventive steps:

  • Identify and remediate vulnerabilities or misconfigurations in web applications and web servers. Deploy the latest security updates as soon as they become available.
  • Audit and review logs from web servers frequently. Be aware of all systems you expose directly to the internet.
  • Utilize the Windows Defender Firewall, intrusion prevention devices, and your network firewall to prevent command-and-control server communication among endpoints whenever possible. This limits lateral movement as well as other attack activities.
  • Check your perimeter firewall and proxy to restrict unnecessary access to services, including access to services through non-standard ports.
  • Enable cloud-delivered protection to get the latest defenses against new and emerging threats.
  • Educate end users about preventing malware infections. Encourage end users to practice good credential hygiene—limit the use of accounts with local or domain admin privileges.

 

 

Detection and Response Team (DART)

Microsoft Defender ATP Research Team

Microsoft Threat Intelligence Center (MSTIC)

 


Threat hunting in Azure Advanced Threat Protection (ATP)

January 7th, 2020

As members of Microsoft’s Detection and Response Team (DART), we’ve seen a significant increase in adversaries “living off the land” and using compromised account credentials for malicious purposes. From an investigation standpoint, tracking adversaries who use this method is quite difficult, as you need to sift through the data to determine whether the activities are being performed by the legitimate user or a bad actor. Credentials can be harvested in numerous ways, including phishing campaigns, Mimikatz, and keyloggers.

Recently, DART was called into an engagement where the adversary had a foothold within the on-premises network, which had been gained through compromising cloud credentials. Once the adversary had the credentials, they began their reconnaissance on the network by searching for documents about VPN remote access and other access methods stored on a user’s SharePoint and OneDrive. After the adversary was able to access the network through the company’s VPN, they moved laterally throughout the environment using legitimate user credentials harvested during a phishing campaign.

Once our team determined the initially compromised accounts, we could begin tracking the adversary within the on-premises systems. Looking at the initial VPN logs, we identified the starting point for our investigation. Typically, in this kind of investigation, your team would need to dive deep into individual machine event logs, looking for remote access activities and movements, as well as any domain controller logs that could help highlight the credentials used by the attacker(s).

Luckily for us, this customer had deployed Azure Advanced Threat Protection (ATP) prior to the incident. Because Azure ATP was operational before the incident, it had already normalized authentication and identity transactions within the customer network. DART began querying the suspected compromised credentials within Azure ATP, which provided us with a broad swath of authentication-related activities on the network and helped us build an initial timeline of events and activities performed by the adversary, including:

  • Interactive logins (Kerberos and NTLM)
  • Credential validation
  • Resource access
  • SAMR queries
  • DNS queries
  • WMI Remote Code Execution (RCE)
  • Lateral Movement Paths
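To illustrate the timeline-building step, the sketch below merges exported events for suspected accounts into a single ordered view. The CSV exports and column names are hypothetical stand-ins for however your tooling exposes the data; Azure ATP itself presents this through its portal rather than through a script like this.

```python
# Sketch: merge exported authentication/access events for suspected
# accounts into one ordered adversary timeline.
import csv
from datetime import datetime

SUSPECTS = {"CONTOSO\\jdoe", "CONTOSO\\svc-backup"}  # hypothetical accounts

events = []
for export in ("vpn_logins.csv", "identity_events.csv"):
    with open(export, newline="") as f:
        for row in csv.DictReader(f):
            if row["account"] in SUSPECTS:
                events.append((datetime.fromisoformat(row["timestamp"]),
                               row["account"], row["source_host"],
                               row["activity"]))

for ts, account, host, activity in sorted(events):
    print(f"{ts:%Y-%m-%d %H:%M:%S}  {account:<22} {host:<16} {activity}")
```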


This data enabled the team to perform more in-depth analysis on both user and machine level logs for the systems the adversary-controlled account touched. Azure ATP’s ability to identify and investigate suspicious user activities and advanced attack techniques throughout the cyber kill chain enabled our team to completely track the adversary’s movements in less than a day. Without Azure ATP, investigating this incident could have taken weeks—or even months—since the data sources don’t often exist to make this type of rapid response and investigation possible.

Once we were able to track the user throughout the environment, we correlated that data with Microsoft Defender ATP to gain an understanding of the tools used by the adversary throughout their journey. Using the right tools for the job allowed DART to jump-start the investigation; identify the compromised accounts, compromised systems, other systems at risk, and the tools being used by the adversaries; and provide the customer with the information needed to recover from the incident faster and get back to business.

Learn more and keep updated

Learn more about how DART helps customers respond to compromises and become cyber-resilient. Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity.


Ransomware response—to pay or not to pay?

December 16th, 2019

The increased connectivity of computers and the growth of Bring Your Own Device (BYOD) in most organizations is making the distribution of malicious software (malware) easier. Unlike other types of malicious programs that may go undetected for a long period, a ransomware attack is experienced immediately, and its impact on information technology infrastructure is often irreversible.

As part of Microsoft’s Detection and Response Team (DART) Incident Response engagements, we regularly get asked by customers about “paying the ransom” following a ransomware attack. Unfortunately, this situation often leaves most customers with limited options, depending on the business continuity and disaster recovery plans they have in place.

The two most common options are either to pay the ransom (with the hopes that the decryption key obtained from the malicious actors works as advertised) or switch gears to a disaster recovery mode, restoring systems to a known good state.

The unfortunate truth is that most organizations are often left with paying the ransom as their only option, because the ability to rebuild has been taken off the table by a lack of known good backups or because the ransomware also encrypted the known good backups. Moreover, a growing list of municipalities around the U.S. has seen their critical infrastructure, as well as their backups, targeted by ransomware, a move by threat actors to better guarantee a payday.

We never encourage a ransomware victim to pay any form of ransom demand. Paying a ransom is often expensive, dangerous, and only refuels the attackers’ capacity to continue their operations; bottom line, this equates to a proverbial pat on the back for the attackers. The most important thing to note is that paying cybercriminals to get a ransomware decryption key provides no guarantee that your encrypted data will be restored.

So, what options do we recommend? The fact remains that every organization should treat a cybersecurity incident as a matter of when it will happen and not whether it will happen. Having this mindset helps an organization react quickly and effectively to such incidents when they happen. Two major industry standard frameworks, the Sysadmin, Audit, Network, and Security (SANS) and the National Institute of Standards and Technology (NIST), both have published similar concepts on responding to malware and cybersecurity incidents. The bottom line is that every organization needs to be able to plan, prepare, respond, and recover when faced with a ransomware attack.

Outlined below are steps designed to help organizations better plan and prepare to respond to ransomware and major cyber incidents.

How to plan and prepare to respond to ransomware

1. Use an effective email filtering solution

According to the Microsoft Security Intelligence Report Volume 24 of 2018, spam and phishing emails are still the most common delivery method for ransomware infections. To effectively stop ransomware at its entry point, every organization needs to adopt an email security service that ensures all email content and headers entering and leaving the organization are scanned for spam, viruses, and other advanced malware threats. An enterprise-grade email protection solution will block most cybersecurity threats against an organization at ingress and egress.

2. Regular hardware and software systems patching and effective vulnerability management

Many organizations are still failing to adopt one of the age-old cybersecurity recommendations and most important defenses against cyberattacks—applying security updates and patches as soon as the software vendors release them. A prominent example of this failure was the WannaCry ransomware outbreak in 2017, one of the largest global cybersecurity attacks in the history of the internet. WannaCry exploited a leaked vulnerability in the Windows Server Message Block (SMB) networking protocol, for which Microsoft had released a patch nearly two months before the first publicized incident. Regular patching and an effective vulnerability management program are important measures to defend against ransomware and other forms of malware, and they are steps in the right direction to ensure an organization does not become a victim of ransomware.

3. Use up-to-date antivirus and an endpoint detection and response (EDR) solution

While owning an antivirus solution alone does not ensure adequate protection against viruses and other advanced computer threats, it’s very important to keep antivirus solutions up to date with their vendors’ latest releases. Attackers invest heavily in the creation of new viruses and exploits, while vendors are left playing catch-up by releasing daily updates to their antivirus database engines. Complementary to owning and updating an antivirus solution is the use of EDR solutions, which collect and store large volumes of data from endpoints and provide real-time, host-based, file-level monitoring and visibility to systems. The data sets and alerts generated by these solutions can help stop advanced threats and are often leveraged for responding to security incidents.

4. Separate administrative and privileged credentials from standard credentials

Working as a cybersecurity consultant, one of the first recommendations I usually provide to customers is to separate their system administrative accounts from their standard user accounts and to ensure those administrative accounts are not usable across multiple systems. Separating these privileged accounts not only enforces proper access control but also ensures that a compromise of a single account doesn’t lead to the compromise of the entire IT infrastructure. Additionally, Multi-Factor Authentication (MFA), Privileged Identity Management (PIM), and Privileged Access Management (PAM) solutions are effective ways to combat privileged account abuse and strategically reduce the credential attack surface.

5. Implement an effective application whitelisting program

It’s very important as part of a ransomware prevention strategy to restrict the applications that can run within an IT infrastructure. Application whitelisting ensures only applications that have been tested and approved by an organization can run on the systems within the infrastructure. While this can be tedious and presents several IT administrative challenges, this strategy has been proven effective.

6. Regularly back up critical systems and files

The ability to recover to a known good state is the most critical strategy of any information security incident plan, especially ransomware. Therefore, to ensure the success of this process, an organization must validate that all its critical systems, applications, and files are regularly backed up and that those backups are regularly tested to ensure they are recoverable. Ransomware is known to encrypt or destroy any file it comes across, and it can often make them unrecoverable; consequently, it’s of utmost importance that all impacted files can be easily recovered from a good backup stored at a secondary location not impacted by the ransomware attack.

Learn more and keep updated

Learn more about how DART helps customers respond to compromises and become cyber-resilient. Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity.


Changing security incident response by utilizing the power of the cloud—DART tools, techniques, and procedures: part 1

November 14th, 2019

This is the first in a blog series discussing the tools, techniques, and procedures that the Microsoft Detection and Response Team (DART) uses to investigate cybersecurity incidents at our customer organizations. Today, we introduce the team and give a brief overview of each of the tools that utilize the power of the cloud. In upcoming posts, we’ll cover each tool in depth and elaborate on techniques and procedures used by the team.

Key lessons learned from DART’s investigation evolution

DART’s investigation procedures and technology have evolved over 14 years of assisting our customers during some of the worst hack attacks on record. Tools have evolved from primarily bespoke (custom) tools into a blend of commercially available Microsoft detection solutions plus bespoke tools, most of which extend the core Microsoft detection capabilities. The team contributes knowledge and technology back to the product groups, who build that experience into our products, so our customers can benefit from our (hard-won) lessons learned during our investigations.

This experience means that DART’s tooling and communication requirements during incident investigations tend to be a bit more demanding than most in-house teams, given we’re often working with complex global environments. It’s not uncommon that an organization’s ability to detect and respond to security incidents is inadequate to cope with skilled attackers who will spend days and weeks profiling the organization and its employees. Consequently, we help organizations across many different industry verticals and from those experiences we have collated some key lessons:

  • Detection is critical (and weak)—One of the first priorities when the team engages to assist with an incident investigation at a customer site is to increase the detection capability of that organization. Over the years, we’ve seen that industry-wide detection has stayed the weakest of the Protect, Detect, Respond triad. While the average dwell time numbers are trending downward, it’s still measured in days (usually double digit numbers) and days of access to your systems is plenty of time to do massive damage.
  • Inadequate auditing—More often than not, DART finds that organizations don’t turn on auditing or have misconfigured auditing with the result that there is not a full record of attacker activities. See auditing best practices for Active Directory and Office 365. In addition, given the current prolific use of weaponized PowerShell scripts by attackers, we strongly recommend implementing PowerShell auditing.
  • Static plus active containment—Static containment (protection) controls can never be 100 percent successful against skilled human attackers, so we need to add in an active containment component that can detect and contain those attackers at the edge and as they move around the environment. This second part is crucial—as they move around the environment—we need to move away from the traditional mindset of “Time to Detect” and implement a “Time to Remediate” approach with active containment procedures to disrupt attackers’ abilities to realize their objective once in the environment. Of course, attackers that have been in the organization for a very long time require more involved investigation and planning for an eviction event to be successful and lessen any potential impact to the organization.

These lessons have significantly influenced the methodology and toolsets we use in DART as we engage with our customers. In this blog series, we’ll share lessons learned and best practices of organizations and incident responders to help ensure readiness.

Observe-Orient-Decide-Act (OODA) framework

Before we can act in any meaningful way, we need to observe attacker activities so we can orient ourselves and decide what to do. Orientation is the most critical step in the Observe-Orient-Decide-Act (OODA) framework developed by John Boyd and overviewed in this OODA article. Wherever possible, the team will light up several tools in the organization, installing the Microsoft Management Agent (MMA) and trial versions of the Microsoft Threat Protection suite, which includes Microsoft Defender ATP, Azure ATP, Office 365 ATP, and Microsoft Cloud App Security (our Cloud Access Security Broker (CASB) solution), illustrated in Figure 1. Why? Because these technologies were developed specifically to form an end-to-end picture across the attacker cyber kill-chain (see Lockheed Martin’s Cyber Kill Chain framework) and together work swiftly to gather the indicators of anomaly, attack, and compromise necessary for successfully blocking the attacker.

The Microsoft ATP platform of tools is used extensively by the Microsoft Corporate IT security operations center (SOC) in our Cyber Defense Operations Center (CDOC), whose slogan is “Minutes Matter.” Using these technologies, the CDOC has dropped its time to remediate incidents from hours to minutes—a game changer we’ve replicated at many of our customers.

Microsoft Threat Protection

The Microsoft Threat Protection platform includes Microsoft Defender ATP, Azure ATP, and Office 365 ATP, as well as additional services that strengthen security for specific attack vectors while adding coverage for attack vectors that would not be covered by the ATP solutions alone. Read Announcing Microsoft Threat Protection for more information. In this blog, we focus on the tools that give DART a high return on investment in terms of speed to implement versus visibility gained.

Infographic showing maximum detection during attack stages, with Office 365 ATP, Azure AD Identity Protection, and Cloud App Security.

Figure 1. Microsoft Threat Protection and the cyber kill-chain.

Although the blog series discusses Microsoft technologies preferentially, the intent here is not to replicate data or signals—the team uses what the customer has—but to close gaps where the organization might be missing signal. With that in mind, let’s move on to a brief discussion of the tools.

Horizontal tools: Visibility across the cyber kill-chain

Horizontal tools include Azure Sentinel and Azure Security Center:

  • Azure Sentinel—New to DART’s arsenal is Azure Sentinel—the first cloud-native SIEM (security information and event management) solution. Over the past few months, DART has deployed Azure Sentinel as a mechanism to combine the different signal sets in what we refer to as SIEM and SOAR as a service. SOAR, which stands for security orchestration, automation, and response, is indispensable in its capability to respond to attacker actions with speed and accuracy. Our intention is not to replicate a customer SIEM but to use the power of the cloud and machine learning to quickly combine alerts across the cyber kill-chain in a fusion model, lessening the time it takes an investigator to understand what the attacker is doing.

Importantly, machine learning gives DART the ability to aggregate diverse signals, quickly form an end-to-end picture of what is going on, and act on that information. In this way, information important to the investigation can be forwarded to the existing SIEM, allowing for efficient and speedy analysis utilizing the power of the cloud.

  • Azure Security Center—DART also onboards the organization into Azure Security Center, if it’s not already enabled for the organization. This tool significantly adds to our ability to investigate and pivot across the infrastructure, especially given that many organizations don’t yet have Windows 10 devices deployed throughout. Security Center also does much more, using machine learning for next-generation detection and simplifying security management across clouds and platforms (Windows/Linux).

DART’s focus for the tool is primarily on the log analytics capabilities that allow us to pivot our investigation, and on the recommended hardening suggestions we apply during our rapid recovery work. We also recommend implementing Security Center proactively, as it gives clear security recommendations that an organization can implement to secure its on-premises and cloud infrastructure. See Azure Security Center FAQs for more information.
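
To make the horizontal tooling concrete, the sketch below runs a KQL query against the Log Analytics workspace that backs Azure Sentinel (and Security Center’s log analytics) through the public query REST API. The endpoint and response shape are documented; the workspace ID, bearer token, and the triage query itself are placeholder assumptions for illustration.

```python
# Minimal sketch: run a KQL query against a Log Analytics workspace
# via the documented query REST API. WORKSPACE_ID and TOKEN are
# placeholders; TOKEN is an Azure AD bearer token issued for the
# https://api.loganalytics.io resource.
import requests

WORKSPACE_ID = "<workspace-guid>"      # placeholder
TOKEN = "<azure-ad-bearer-token>"      # placeholder

# Hypothetical triage query: accounts with repeated failed sign-ins
# over the last day. SigninLogs stores ResultType as a string; "0"
# means success.
KQL = """
SigninLogs
| where TimeGenerated > ago(1d)
| where ResultType != "0"
| summarize FailedSignIns = count() by UserPrincipalName
| top 10 by FailedSignIns
"""

resp = requests.post(
    f"https://api.loganalytics.io/v1/workspaces/{WORKSPACE_ID}/query",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"query": KQL},
    timeout=30,
)
resp.raise_for_status()
# The response contains one or more tables, each with columns and rows.
for table in resp.json()["tables"]:
    for row in table["rows"]:
        print(row)
```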

Vertical tools: Depth visibility in designated areas of the cyber kill-chain

Vertical tools include Azure ATP, Office 365 ATP, Microsoft Defender ATP, Cloud App Security, and custom tooling:

  • Azure ATP—The 2018 Verizon Data Breach Investigations Report found that 81 percent of breaches involved compromised credentials. Every incident that DART has responded to over the last few years has had some component of credential theft; consequently, Azure ATP is one of the first tools we implement when we get to a site—before, if possible—to get insight into what users and entities are doing in the environment. This allows us to use its built-in detections to surface suspicious behavior, such as suspicious changes to identity metadata and user privileges.
  • Office 365 ATP—With approximately 90 percent of all attacks starting with a phishing email, having ways to detect when a phishing email makes it past email perimeter defenses is critical. DART investigators always want to know the mechanism by which the attacker compromised the environment—simply so we can be sure to block that vector. We use Office 365 ATP capabilities—such as security playbooks and investigation graphs—to investigate and remediate attacks faster.
  • Microsoft Defender ATP—If the organization has Windows 10 devices, we can implement Microsoft Defender ATP (previously Windows Defender ATP)—a cloud-based solution that leverages a built-in agent in Windows 10. Otherwise, we’ll utilize MMA to gather information from older versions of Windows and from Linux machines and pull that information into our investigation. This makes it possible to detect attacker activities, aggregate this information, and prioritize the investigation of detected activity (see the advanced hunting sketch after this list).
  • Cloud App Security—Cloud App Security is a multi-mode cloud access security broker that natively integrates with the other tools DART deploys, giving access to sophisticated analytics to identify and combat cyberthreats across the organization. This allows us to detect any malicious activity using cloud resources that the attacker might be undertaking. Cloud App Security, combined with Azure ATP, allows us to see if the attacker is exfiltrating data from the organization, and also allows organizations to proactively identify and assess any shadow IT they may be unaware of.
  • Custom tooling—Bespoke custom tooling is deployed depending on attacker activities and the software present in the organization. Examples include infrastructure health-check tools, which allow us to check for any modification of Microsoft technologies—such as Active Directory, Microsoft’s public key infrastructure (PKI), and Exchange health (where Office 365 is not in use)—as well as tools designed to detect the use of specific specialist attack vectors and persistence mechanisms. Where machines are in scope for a deeper investigation, we normally utilize a tool that runs against the live machine to acquire more information about it, or even run a full disk acquisition forensic tool, depending on legal requirements.
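
As referenced in the Microsoft Defender ATP bullet above, here is a minimal sketch of an advanced hunting query run through the documented Defender ATP API. The endpoint and request shape are documented; the bearer token is a placeholder, the hunt itself is a hypothetical example, and the advanced hunting schema has evolved over time, so table and column names may differ in your tenant.

```python
# Minimal sketch: run an advanced hunting query through the documented
# Microsoft Defender ATP API. TOKEN is a placeholder Azure AD bearer
# token for the WindowsDefenderATP application.
import requests

TOKEN = "<azure-ad-bearer-token>"  # placeholder

# Hypothetical hunt: encoded PowerShell launches over the past week,
# tying back to the PowerShell auditing lesson earlier in this post.
QUERY = """
DeviceProcessEvents
| where Timestamp > ago(7d)
| where FileName =~ "powershell.exe" and ProcessCommandLine has "-enc"
| project Timestamp, DeviceName, AccountName, ProcessCommandLine
"""

resp = requests.post(
    "https://api.securitycenter.microsoft.com/api/advancedqueries/run",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"Query": QUERY},
    timeout=30,
)
resp.raise_for_status()
# "Results" is a list of dicts keyed by the projected column names.
for row in resp.json()["Results"]:
    print(row["Timestamp"], row["DeviceName"], row["ProcessCommandLine"])
```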

Together, the vertical tools give us an unparalleled view into what is happening in the organization. These signals can be collated and aggregated into both Security Center and Azure Sentinel, where we can pull in other data sources available to the organization’s SOC.

Figure 2 represents how we correlate the signal and utilize machine learning to quickly identify compromised entities inside the organization.

Infographic showing combined signals: Identity, Cloud Apps, Data, and Devices.

Figure 2. Combining signals to identify compromised users and devices.

This gives us a very swift way to bubble up anomalous activity and allows us to rapidly orient ourselves against attacker activity. In many cases, we can then use automated playbooks to block attacker activity once we understand the attacker’s tools, techniques, and procedures; but that will be the subject of another post.
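
To illustrate the fusion idea in miniature, the toy sketch below keys alerts from different sensors on the entity they implicate and ranks entities by how many independent sources flagged them. It is a deliberately simplified stand-in, with hand-made sample data, for the machine learning-driven correlation described above, not DART’s actual model.

```python
# Toy illustration of signal fusion: alerts from different sensors are
# keyed on the implicated entity (user or device), and entities seen by
# multiple independent sources bubble to the top of the triage queue.
from collections import defaultdict

alerts = [  # hand-made sample data, not real telemetry
    {"source": "identity",  "entity": "alice",    "signal": "atypical sign-in"},
    {"source": "endpoint",  "entity": "alice",    "signal": "encoded PowerShell"},
    {"source": "cloudapps", "entity": "alice",    "signal": "mass download"},
    {"source": "endpoint",  "entity": "buildsrv", "signal": "new service installed"},
]

sources_by_entity = defaultdict(set)
for alert in alerts:
    sources_by_entity[alert["entity"]].add(alert["source"])

# Rank entities by the number of independent sources that flagged them.
for entity, sources in sorted(sources_by_entity.items(),
                              key=lambda kv: len(kv[1]), reverse=True):
    print(f"{entity}: {len(sources)} source(s) -> {sorted(sources)}")
```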

Next up—how Azure Sentinel helps DART

Today, in Part 1 of our blog series, we introduced the suite of tools used by DART and the Microsoft CDOC to rapidly detect attacker activity and actions—because in cyber incident investigations, minutes matter. In our next blog, we’ll drill down into Azure Sentinel capabilities to highlight how it helps DART; stay tuned!


Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity.

The post Changing security incident response by utilizing the power of the cloud—DART tools, techniques, and procedures: part 1 appeared first on Microsoft Security.