
Finding PII Data in Splunk Logs

What is PII?

Personally Identifiable Information (PII) is any data that could be used to identify a specific person.

What is included in PII?

What counts as PII varies by country, but it usually includes the following:

  • Mobile number
  • Phone number
  • Physical Address
  • Email address
  • Aadhaar number
  • PAN number
  • Salary amount
  • Social security number
  • National ID number
  • Session cookies
  • Username
  • Password

How to find PII data in Splunk logs?

To find PII, start with a basic query such as index=test "*@gmail.com". The results will contain Gmail addresses. From that output, note the variable names the company uses for PII, such as "emailAddress" or "Phonenumber". Then repeat the process for the other variable names the logs use to hold PII.
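
The step of reading variable names out of the search output can be sketched in Python. This is a minimal sketch, assuming you have exported some result lines from Splunk and that the logs use key=value or JSON-style "key": "value" pairs; the function name, the regular expression, and the sample lines are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical pattern: a key, then "=" or ":", then an optionally quoted
# email-like value. Real log formats may need a different pattern.
KEY_BEFORE_EMAIL = re.compile(
    r'"?([A-Za-z_][A-Za-z0-9_]*)"?\s*[=:]\s*"?'
    r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}'
)

def email_field_names(log_lines):
    """Count the variable names that appear directly before an email-like value."""
    counts = Counter()
    for line in log_lines:
        counts.update(KEY_BEFORE_EMAIL.findall(line))
    return counts

# Made-up sample lines standing in for exported search results.
sample = [
    'level=INFO emailAddress=alice@gmail.com a=checkout',
    '{"customerEmail": "bob@gmail.com", "orderId": 42}',
    'level=INFO emailAddress=carol@gmail.com a=orders',
]

for name, count in email_field_names(sample).most_common():
    print(name, count)
```

The names it reports (here emailAddress and customerEmail) are the ones to search for next.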

While searching for PII, we found a set of variable names that companies commonly use for it; they are listed below. You can use these names to craft your own queries.

How to find all service names and their associated loggers?

Search for several email domains at once, as below, to list every service (field "a") and logger that logged a matching address:
index=test "@gmail.com" OR "@hotmail.com" OR "@outlook.com" | stats count by a, logger
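
If the domain list grows, the query string can be generated instead of typed by hand. The following is a minimal sketch; the function name is hypothetical, and the field names "a" and "logger" are taken from the query above and may differ in your environment. The parentheses are added only to make the grouping of the OR terms explicit.

```python
def services_by_domain_query(index, domains, service_field="a", logger_field="logger"):
    """Build an SPL search that counts events per service and logger
    for the given email domains."""
    if not domains:
        raise ValueError("at least one domain is required")
    # Accept domains with or without a leading "@".
    terms = " OR ".join(f'"@{d.lstrip("@")}"' for d in domains)
    return f"index={index} ({terms}) | stats count by {service_field}, {logger_field}"

print(services_by_domain_query("test", ["gmail.com", "hotmail.com", "outlook.com"]))
```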

Commonly used variable names:

  • email address
  • ZipCode
  • Address
  • Order amount
  • sessionid
  • secureData
  • hash
  • webOrderNumber
  • customerSessionId
  • authKey
  • deliveryAddress
  • eveningPhone
  • postalCode
  • firstName
  • lastName
  • mobilePhone
  • street
  • email
  • state
  • cardType
  • cardInteger
  • cardBinNumber
  • cardExpirationMonth
  • DOB
  • Birth
  • personId
  • businessAddress
  • shippingAddress
  • sessionId
  • billingAddress
  • checkoutSessionId
  • sessionIdentifier
  • cookie
  • bankAccountNumber
  • taxPayerId

Note: Not all of these variable names hold PII, but they help you locate PII in the logs.

Examples of Splunk PII data dorks:

  • index=test "*@gmail.com"
  • index=test "*@gmail.com" AND a="ServiceName"
  • index=test "*@*.com"
  • index=test "billingAddress"
  • index=test "billingAddress" AND a="ServiceName"
  • index=test "billingAddress" AND a="ServiceName" AND a!="Servicename1" [a!="Servicename1" excludes results from the service Servicename1]
  • index=test "mobilePhone"
  • index=test "@gmail.com" OR "@hotmail.com" OR "@outlook.com" OR "@hotmail.co.uk" OR "@yahoo.com"
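
The dorks above follow one pattern: an index, a quoted keyword, and optional service filters. A small helper can generate them for any variable name. This is a sketch under assumptions: the function names are hypothetical, "a" is assumed to be the service name field, and values containing a double quote are rejected rather than escaped.

```python
def quoted(value):
    """Wrap a value in straight double quotes (curly quotes do not work as quotes in a search)."""
    if '"' in value:
        raise ValueError("value must not contain a double quote")
    return f'"{value}"'

def pii_dork(index, keyword, service=None, exclude_services=(), service_field="a"):
    """Build a search for a keyword, optionally limited to or excluding services."""
    clauses = [f"index={index} {quoted(keyword)}"]
    if service:
        clauses.append(f"{service_field}={quoted(service)}")
    clauses.extend(f"{service_field}!={quoted(name)}" for name in exclude_services)
    return " AND ".join(clauses)

# One query per variable name, scoped to a single service.
for name in ["billingAddress", "mobilePhone"]:
    print(pii_dork("test", name, service="ServiceName"))
```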

Important points:

  • When you search for PII, first find the service names, then filter the PII search by service name. This narrows the results to one service at a time. Example:
    Step 1: Search index=test "Address" in Splunk.
    Step 2: In the fields list on the left-hand side, click on "a".
    Step 3: All the service names are now visible.
    Step 4: Search for PII in one service only: index=test "Address" AND a="ServiceName"
  • The query becomes part of the search URL, so PII values must not appear in it. Craft the query so that the URL you include in the report contains no PII, for example by searching on a variable name rather than on an actual email address.
  • Refine the query with unique values or variable names, for example index=test "emailAddress" "bankAccount" AND a="Servicename" AND logger="com.xxx.store.yyy.vvv"
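
Before a search URL goes into a report, the query can be checked for literal PII values. The following is a rough heuristic sketch, not a complete detector: the function name is hypothetical, and the two patterns (a full email address, or a run of nine or more digits) are assumptions you should adapt to the data you handle.

```python
import re

# A full address such as alice@gmail.com; a bare "@gmail.com" or "*@gmail.com"
# has no local part and is not flagged.
FULL_EMAIL = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}')
# Assumed threshold: long digit runs may be phone, account, or ID numbers.
LONG_DIGITS = re.compile(r'\d{9,}')

def query_leaks_pii(query):
    """Return True if the query text appears to contain a literal PII value."""
    return bool(FULL_EMAIL.search(query) or LONG_DIGITS.search(query))

for q in ['index=test "*@gmail.com"',
          'index=test "emailAddress" AND a="ServiceName"',
          'index=test "alice@gmail.com"']:
    print(query_leaks_pii(q), q)
```

Only the last query is flagged, because it names a specific address instead of a domain or a variable name.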
