Teri Radichel

Pointers and References

When your code points to security problems

One of my posts that may later become a book on Secure Code. Also one of my posts on Application Security.

When I reviewed the blogs for my last book I made serious edits and added additional content in the book. I’m sure this will happen in these blog posts as well. I was grateful to have a security professional I knew review the book. He let me know where I had an egregious typo and had used the wrong word. I knew what I meant in my mind, but the word I used for what I was trying to explain was incorrect. Someone reading my word would have either learned the incorrect definition of the word or at least had a different understanding than I had intended.

You can think of a word like a pointer or reference. The word points to a definition. You will find technical distinctions between pointers and references in programming, but both point to something. When your reference or pointer points to the wrong thing, you may have a bug in your application. In some cases, it may also be a security problem. Attackers and unscrupulous programmers may try to manipulate pointers and references, guess them, obtain data, or abuse them in some of the other ways explained in this blog post.

Code without pointers

Let’s say you create a set of data that contains a list of customers. The data might look like this:

Customer First Name | Customer Last Name | Phone Number
Mickey                Mouse                111-222-3333
Donald                Duck                 222-333-4444
Minnie                Mouse                333-444-5555

You load this data into system memory and now you want to sort it. You create a function to sort the data. You pass all the data shown above to the function to sort your list. Then you pass all the sorted data back to the calling function after it is sorted.

Most companies will have more than three customers. Copying the entire customer list around in your application would be pretty inefficient. Your system would use a lot of memory and it could take a lot of time to copy a large list into your sorting function and return it again.

Copying is not only inefficient. It also means the function works on a copy of the data rather than the original. Let’s say that after you passed a copy of the data into the sorting function, someone called another function that updates phone numbers. That function updates Mickey’s phone number to 555-666-7777 in the original data. Your sort function now returns outdated information because it’s operating on stale data. I explain other problems that occur when two different parts of a program operate on data at the same time in a post on concurrency. For now, let’s make this program more efficient.

Instead of sending the entire customer list to functions that need to operate on it, our application can pass the location of the data. When a function receives the location it can operate on the data where it is located, rather than receiving the data and passing it back again. That’s how pointers and references work. You pass information to a function that tells it where to find the data instead of passing the data itself.

By operating on data this way, applications don’t need to store as much data in memory or on disk. Additionally, each time the application copies data, it spends extra processing time retrieving and storing it. Passing a reference to the data instead of copying the data itself reduces that processing time and resource use.
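Here’s a minimal sketch of the difference, using the sample customer list from earlier. The Customer struct and the function names sortedCopy and sortInPlace are hypothetical names made up for this illustration. The first function receives and returns a copy of the whole list. The second receives a reference and sorts the data where it already lives.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record type mirroring the sample customer table.
struct Customer {
    std::string firstName;
    std::string lastName;
    std::string phone;
};

// Orders customers alphabetically by first name.
bool byFirstName(const Customer& a, const Customer& b) {
    return a.firstName < b.firstName;
}

// Pass by value: the caller's list is copied into the function, and the
// sorted copy is handed back. The caller's original list is untouched, so
// later updates to the original do not show up in the returned copy.
std::vector<Customer> sortedCopy(std::vector<Customer> customers) {
    std::sort(customers.begin(), customers.end(), byFirstName);
    return customers;
}

// Pass by reference: the function receives the location of the caller's
// list and sorts it in place. Nothing is copied.
void sortInPlace(std::vector<Customer>& customers) {
    std::sort(customers.begin(), customers.end(), byFirstName);
}
```

If you call sortedCopy and then change Mickey’s phone number in the original list, the returned copy still holds the old number. With sortInPlace there is only one list, so there is no stale copy to worry about.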

Examples of pointers or references

There are many ways in which applications point to or reference something else. Here are some examples:

  • A pointer in C++ stores the memory location of another variable.
  • An object reference in Java points to an object.
  • Identity values in a database point to a particular record.
  • A website URL points to a particular website or application.
  • A domain name represents one or more IP addresses.
  • An AWS S3 object key points to an object stored in a bucket.
  • A session ID points to a logged-in user’s session.

Pointers and references may return a memory address where the application can find the value of a variable or object. Then the application can retrieve the data from that location or act on it there. An identifier representing a user may represent a person or system associated with a particular set of permissions. An identifier in a database indicates where to find a particular set of data.

Pointing to memory

Think of the memory on your system like mailboxes at the post office. Each box has a number associated with it: 201, 202, 203. When someone comes to pick up their mail, they ask the clerk to give them the mail in mailbox 202. The clerk goes to get that mail out of box 202 and hands it to the person. 202 points to a mailbox that contains the mail (or the data). In a programming language, a pointer or reference can store a memory address on the system. A program would use that memory address to access data stored at that memory location.

In C++ you can declare a string variable and assign a value.

string dog = "Fido";

Print the output of your variable to the screen:

cout << "Our dog is " << dog << '\n';

You will see:

Our dog is Fido

Create a second variable that points to our dog variable. Declare it as a pointer to a string and assign it the address of dog, which you get by putting an ampersand (&) in front of the name of the variable to which it points.

string* pointerToDog = &dog;

When the system prints out the value of pointerToDog or &dog, you would see a memory address instead of the dog’s name. The application would use that memory location to obtain the value of the variable dog or act on it in some way. For example, the application might rename your dog by updating the value at that memory location to ‘Pluto.’
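As a rough sketch of the renaming step, the function below writes a new name through the pointer. The function name renameDog is hypothetical and only for illustration. Notice that it never receives the string itself, only the location of the string.

```cpp
#include <string>

// Writes a new name to the memory location the pointer refers to, which
// changes the original variable. Returns false and changes nothing when
// the pointer does not point at anything.
bool renameDog(std::string* pointerToDog, const std::string& newName) {
    if (pointerToDog == nullptr) {
        return false;
    }
    *pointerToDog = newName;  // the asterisk means "the value at this address"
    return true;
}
```

Calling renameDog(&dog, "Pluto") changes the value of dog in the calling code to Pluto, because both the caller and the function are looking at the same memory location.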

If you want to read more about pointers in C/C++ here’s a good reference.

This post doesn’t go into too much technical detail about what pointers and references are because the purpose of this content is not to tell you how to implement them. Rather, I want to explain the concept of pointers and references at a higher abstracted level and how attackers or unscrupulous programmers may abuse them. By understanding what can go wrong when something in your application points to something else, you can prevent these problems in any context.

Pointing to another data storage location

Pointers and references may point to other types of data storage locations as well. When you create a record in a database it often has an index or ID. Systems use that value to find the record in the database.

When you create a cloud object storage location such as an AWS S3 bucket, Azure storage account, or Google Cloud storage bucket, how do you find the data you store in it? You’ll use a key that identifies the object. That key points to or references the object so you can retrieve, update, or delete it later.

Pointing to applications and code

Many times a pointer will point to applications or code. In some cases your code may be tricked into using an incorrect pointer to a particular piece of code. In other cases, the code you point to may change to something you’re not expecting. When you engineer systems, consider all the ways this could happen and defend against them.

Here are some of the ways in which your systems may use references to other applications and external code:

  • A domain name points to one or more IP addresses. That domain name is often a pointer to some server, service, or application running on the Internet.
  • A URL is a more granular pointer that will point to a particular page within an application, an API, or JavaScript that executes in a browser when loaded.
  • Within an operating system, pointers tell programs which services to access, which OS APIs to call, and where these things are located.
  • Within an application, names of functions or methods indicate which method an application should call.

Many different types of attacks try to trick programs into calling code they should not. For example, in Java, you can have two classes with the same name in a program. When writing code to instantiate a class, how can you be sure the correct one gets loaded?

I had numerous related headaches with Spring, a Java framework, when it first came out. Spring would be pointing to a long list of libraries, sometimes with overlapping names or different cached versions. I don’t remember the exact problem but I had to fight with it to get it to use the class I intended.

This is one of the reasons I’m not always a fan of complicated frameworks. They are supposed to make things easier and save time. But sometimes, they are overly complicated and error-prone compared to writing my own code. Of course, my own code may take longer and I am a very experienced programmer. For some applications and organizations, it makes sense to use frameworks because they provide protections and make things easier even though they increase system complexity. The decision of when to use or not use a framework is a judgement call for the system architects and engineers.

Pointing to system users and permissions

System users are identified with session IDs or usernames and passwords. These IDs represent a person or system with a particular set of permissions. Those IDs grant access to read data and take actions on systems.

When someone can access an ID and impersonate or act as another user, they can do things they shouldn’t be able to do. Protecting the identifiers that represent a particular user in all the places where they are used throughout a system is paramount.

I may have mentioned this in my prior book, but when we were trying to identify a file in a particular system, one of the engineers suggested we could name the file after the session ID of the logged-in user. Those files got downloaded to user machines and might be accessible to customer support personnel. I had to explain why that was not a good idea.

Pointers and references gone wrong

Here’s the essential problem with pointers and references, using an analogy. Someone asks you who your sister is. She’s standing across the room and you point at her. Just as you do that, she moves. Now you are pointing at nothing. A moment later your brother steps into the spot where your sister was standing. Someone asked you who your sister was, but by the time they looked where you were pointing, they could have seen either nothing or your brother. That pretty much sums up most problems with pointers in a few words.

On the other side of this pointer, imagine that you are swapped out with your cousin. Now your cousin is pointing at your sister. Your sister is not your cousin’s sister. The manipulation can occur on either side of the pointer. An attacker can manipulate the pointer or the thing at which it points.

There are many ways in which programming errors lead to incorrect pointers. Attackers may try to trick your program into pointing to the wrong data or location. A programmer might make an error that allows an attacker to come along and abuse the pointer after the programmer is finished using it. Sometimes an attacker may be able to swap out what an ID or pointer references. Attackers may try to obtain references to access information they should not be able to access. Let’s look at some of these abuses in action.

As a software engineer, you need to ensure that your logic always correctly points to what you intend. You also need to defend against attackers and other programmers using your code or system who might try to change your pointers and references to something they shouldn’t. They may manipulate your pointer, or try to alter what it references. They may try to use pointers and references to get at data or take actions they should not be able to take.

Data swapping

Let’s say you purchase a product on an e-commerce website and get your receipt in an email that has pictures of what you purchased. However, when you log into the e-commerce website later and look at your order, your order shows the wrong products! How could this happen?

Back when I started creating e-commerce websites everything was brand new. There was no sample code. We didn’t know how to do all the things that seem so easy and obvious today. There are so many gadgets that help you create a product grid on a website that today’s programmers might say that is easy. Back then, I worked on a team of programmers trying to figure out how to build one from scratch. At one point, everyone told me to give up, but I figured it out after a lot of thinking about the problem. Those little numbers at the bottom of the screen (page 1, page 2, and so on) were somewhat of an accomplishment at the time.

One mistake I made early on was that I used a pointer to a product (an ID) in my order items table in a database. The receipt and order pages that customers would log in to view after a purchase would retrieve the description of each item from the product table. However, if someone changed the data in the product table to a completely different product for the same ID, the customer could see different products on their receipt or in their order history than those they intended to buy!

I quickly fixed that problem and did not make it again. I started to think about other places where data changes could cause a problem at a later point in time. Part of your software engineering design should consider when it is appropriate to copy data and when you should point to it. I’ll talk about how copying data can lead to the opposite problem when it comes to data integrity in the next post.

Instead of pointing to the ID in the database, I needed to make a copy of the product descriptions in my tables that store what the customer ordered. You may think this is an obvious problem. But I just experienced the same issue on a very large e-commerce website that people use to sell their products. When I logged in to report a problem with my order, I saw a different product! The seller had altered the description of the product. If I was returning a very expensive item, I might get the wrong amount back in return.
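To make the fix concrete, here is a small sketch that stands in for the database tables with in-memory structures. The names Product, Catalog, OrderItem, and makeOrderItem are all hypothetical and are not from my original system. The order item keeps the product ID but also copies the description and price at the time of purchase.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a row in the product table.
struct Product {
    std::string description;
    long priceCents;
};

// Hypothetical stand-in for the product table, keyed by product ID.
using Catalog = std::map<int, Product>;

// An order item keeps the product ID for lookups and reporting, but it also
// stores its own copy of what the customer actually saw and paid.
struct OrderItem {
    int productId;
    std::string description;  // snapshot taken at purchase time
    long priceCents;          // snapshot taken at purchase time
};

// Copies the product details into the order item when the order is placed.
// Catalog::at throws std::out_of_range if the product ID does not exist.
OrderItem makeOrderItem(const Catalog& catalog, int productId) {
    const Product& product = catalog.at(productId);
    return OrderItem{productId, product.description, product.priceCents};
}
```

If a seller later changes the catalog entry for that ID to a different product, the order item still shows the description and price the customer agreed to.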

A related problem happened when I received the wrong item. I tried to return it and I was told my return was rejected because I didn’t return the item I purchased. Luckily I explained the problem to the e-commerce merchant and ended up getting a full refund. In this case the problem may or may not have been related to code. I don’t know if a system in the warehouse instructed someone to ship me the wrong product or a person just made a mistake.

Think about how this problem could affect product reviews. I suspect this may actually be happening on some major e-commerce websites. A seller lists a product that gets 5-star reviews. At some point, the seller has a new product that has never been reviewed and has run out of the old product. Instead of creating a new product, the seller updates the existing product with 5-star reviews to the description of the new product. It could turn out that the new product is of subpar quality even though it shows up in product search results as a 5-star product.

Here’s another somewhat humorous example. Many people are very excited about blockchain, which is the underlying technology used by Bitcoin and other cryptocurrencies. They claim that it is more secure since it is a decentralized platform and everything is encrypted. I’m not here to debate that either way. I’m just going to explain that although a particular aspect of a technology may be amazing, not understanding the attack vectors, weaknesses, and how to use it properly can cause security problems.

One way to use blockchain is for smart contracts that track the details of a transaction. In theory, using blockchain will make these contracts more trustworthy and secure. One company started demonstrating how they were going to use smart contracts to sell art. A security researcher came along and claimed that he had painted the Mona Lisa. The smart contract proved it! What really happened is that he was able to swap out the actual art in the contract with anything he wanted, so he pointed the contract to the Mona Lisa.

Another potential concern I run across on penetration tests relates to web applications that allow customers to enter URLs that point to third-party websites. For example, an advertising service that allows customers to put URLs into the system to point to an ad hosted on another web server. The advertising service checks the ad hosted on the third-party web server to make sure it does not contain any malware. However, after approval, the customer can change the URL, or they could change the third-party webpage to include malicious data.

I wrote about how including third-party code on your website by loading it from remote URLs can lead to a similar problem. Injection is a type of attack where attackers try to insert code into your application and get it to execute. If you point to third-party APIs or domain names on your webpage you should be taking steps to ensure the data is what you expect. More on how to protect your application when you use third-party APIs in the next post on data integrity.

Guessing object references

If an attacker can guess the identifier for some data or an object, that may provide unauthorized access to data. This is another mistake I made early on in my e-commerce career. I created a website that included the order ID in the URL. The order ID came from an ID in the database. The ID increased by one for each order.

In other words, someone placed an order and their order ID was 1000. The next customer would have order 1001. The customer after that would have order ID 1002, and so on. What’s the problem with that? That depends on how your application works.

Consider what would happen if I allowed customers to request pentesting services on my current website. Let’s say after a person submitted their order my website had a URL like this that displayed the order information.

https://2ndsightlab.com?orderid=1000

An attacker might try to guess other order IDs to try to see other people’s order data. If the IDs are incremental it’s easy for an attacker to increase their order ID by one and guess the next order number. Visiting the following URLs might show each person’s order in some cases:

https://2ndsightlab.com?orderid=1001
https://2ndsightlab.com?orderid=1002
https://2ndsightlab.com?orderid=1003

This type of security vulnerability is known as an insecure direct object reference (IDOR). One of the problems here is an easy-to-guess object reference (the order ID). Instead of making IDs simple to calculate, you may want to randomize them with a secure programming library.

Web applications that leverage session IDs to identify a visitor can also have the same problem. An attacker who can guess the next session ID would be able to take any action the true owner of the valid session ID could take. They may be able to update or delete data or perform fraudulent banking transactions. Make sure your application properly randomizes session IDs, if you use them, and keep them secret!

Some randomization libraries have logic errors that lead to data that is not completely random or not random enough to meet various security standards. Do your homework when trying to create random data. For example in Java, you should know the difference between java.util.Random and java.security.SecureRandom and how to use a proper seed.
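The sketch below shows the general shape of generating a hard-to-guess identifier in C++. It comes with a big caveat: standard C++ has no direct equivalent of java.security.SecureRandom. This example assumes std::random_device is backed by the operating system’s secure random source, which is common but is implementation-defined and not guaranteed by the standard. Treat the function name randomHexId as hypothetical, and for production use prefer your operating system’s secure random API or a vetted cryptography library.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Returns byteCount random bytes encoded as lowercase hexadecimal.
// ASSUMPTION: std::random_device draws from a secure OS source on this
// platform. The C++ standard does not promise that, so verify it for your
// compiler and platform before relying on it for security.
std::string randomHexId(std::size_t byteCount) {
    static const char kHexDigits[] = "0123456789abcdef";
    std::random_device source;
    std::uniform_int_distribution<int> byteValue(0, 255);

    std::string id;
    id.reserve(byteCount * 2);
    for (std::size_t i = 0; i < byteCount; ++i) {
        const int value = byteValue(source);
        id.push_back(kHexDigits[value >> 4]);    // high four bits
        id.push_back(kHexDigits[value & 0x0F]);  // low four bits
    }
    return id;
}
```

Calling randomHexId(16) produces a 32-character identifier drawn from 128 random bits, which is far harder to guess than the next number in a sequence.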

If your application server uses session IDs, hopefully it randomizes the values appropriately for you. Additionally, you may be able to use a function in a database to generate random ID values.

All that being said, using incremental numbers for rows in a database has some advantages when it comes to auditing. If you have a system that increments ID numbers you can see if something got deleted when you see a gap in the numbers. A bigger problem with our order ID example was not that the order IDs were not random. It was the fact that someone visiting the website could see someone else’s orders because they had permission to view any order ID. The bigger problem in that example had to do with authorization.

When someone accesses data in a web application, make sure that you check not only if they are allowed to access the system, but if they are allowed to view the data they are trying to access. That is a problem I find a lot while penetration testing. Sometimes programmers check to see if someone can log in. In each request they check if the user has a valid session ID or access token. However, they do not ensure in all cases that the visitor using that session ID or access token has permission to view the data or take the action they are requesting.

The mechanism many websites use at the time of this writing to properly secure data is JSON Web Tokens (JWTs). Once again, I’m not going to go into all the technical details, but from a security perspective, make sure that you not only check to see if the person is allowed to log into your application but whether the scopes in the token give them permission to access the data or take the action in their request. That way even if someone does guess an object reference or ID of some kind, they won’t be able to use it.

The best way to fix the problem I had with order IDs in my e-commerce system would be a finer-grained analysis of permissions before granting access to the data. When someone requested to view an order for a particular order ID, I should have checked to see if the logged-in user was the person who submitted the order he or she wanted to view.
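A minimal sketch of that ownership check might look like the following. The Order struct, the OrderTable type, and the findOrderFor function are hypothetical names, and an in-memory map stands in for the database. The important part is that the lookup takes both the order ID and the identity of the logged-in user, and it refuses to hand back an order that belongs to someone else.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a row in the orders table.
struct Order {
    int id;
    std::string ownerUserId;
    std::string details;
};

// Hypothetical stand-in for the orders table, keyed by order ID.
using OrderTable = std::map<int, Order>;

// Returns a pointer to the order only if it exists AND belongs to the
// requesting user. Returns nullptr in every other case, so a caller cannot
// tell a missing order from one that belongs to someone else.
const Order* findOrderFor(const OrderTable& orders, int orderId,
                          const std::string& requestingUserId) {
    const auto found = orders.find(orderId);
    if (found == orders.end()) {
        return nullptr;  // no such order
    }
    if (found->second.ownerUserId != requestingUserId) {
        return nullptr;  // order exists but belongs to another user
    }
    return &found->second;
}
```

With a check like this in place, guessing orderid=1001 gets an attacker nothing unless order 1001 is actually theirs.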

Exposed references

A few years ago a researcher was looking into how a product from a security vendor functioned. While digging into the network traffic, data, and code associated with the application, the security researcher discovered an AWS encryption key ID. Using that key ID, the researcher demonstrated that he could access and decrypt other people’s data.

I wrote an article about this breach and one of the programmers at that company thought that the fact that the researcher was able to view the AWS encryption key ID was not a problem. The researcher did not get access to the encryption key, only a key ID. What’s the difference? The encryption key ID allowed the researcher to decrypt other people’s data. It doesn’t matter if the key was protected because the key ID essentially provided the same functionality as the key itself.

Understand how references and pointers exposed to those who use your application may be a security problem. Understand what they can do with that object reference or pointer, should they be able to access it and take action with it.

In this instance, the problem was that the application developers used a single encryption key for every customer. They put the key ID in the source code delivered to the customers. They thought that because they didn’t give out the encryption key, they were protecting it. However, anyone who could access that encryption key ID could take the same actions as if they had the key itself.

Understand where you have object references in your application and whether or not they should be exposed to customers. Understand if those references may be exposed in other ways as the program executes. For example, if you put sensitive data such as session IDs in your application URLs and then you call third-party URLs on your web page, those session IDs may be in the referrer value (the HTTP Referer header) sent to the third-party page. Someone maintaining that web application could access customer sessions. I see that frequently on penetration tests. Any sensitive data could also end up in system logs. Consider object references in error messages as well.

Stealing references

As with the key ID, you may think that your system is secure because an attacker cannot steal the usernames and passwords required to log into the system. Even though you properly protect those mechanisms for logging in, an attacker may be able to obtain a type of reference to that logged-in user that allows taking action within a system.

When you log into some systems, including Microsoft operating systems, they don’t pass around a copy of your user name and password when you use the system to validate your permission to take actions. After the initial login, the system will store a hash or a token of some kind. That value is a reference to an active system user. That user has successfully proven who they are by logging in with a valid username and password.

Once logged in, a user has a token (or a value with some other name that represents an active user on the system). It exists in memory. As I’ve already explained, attackers love getting access to system memory because it’s the one place your data needs to be unencrypted at some point.

Even if you use MFA to protect your systems, once a user has a token they generally aren’t re-authenticating and entering a second factor for every single action they take. That makes these tokens especially valuable to attackers. Attackers will try to use some of the vulnerabilities described in this book to try to gain access to systems, or to system memory. If they can install malware on a system, they can use that to gain access to different types of session tokens or hashes that identify active system users. Mimikatz is an example of a tool that tries to capture Windows credentials from system memory.

Carefully protect sensitive values that reference users and data. Do not expose them unnecessarily. Unfortunately these tokens need to be passed around in system memory to be useful. If an attacker gets into system memory, they could obtain a token. Although your code may be secure, some other code on the system may lead to an error that exposes system memory.

Due to the sensitivity of the data and the potential harm their exposure may cause, take additional steps to make these tokens harder to use. Rotate the values frequently. Expire the old value so someone trying to use it gets blocked. By rotating the values frequently, attackers will have a limited time to use stolen tokens. Of course, if they have access to your system, they can simply steal another token if you are not paying attention to this activity on the system.
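Here is one way the expire-and-rotate idea could be sketched. Everything in it is an assumption for illustration: the SessionToken struct, the function names, and the choice to pass the current time in as a parameter. A real system would also generate the token values securely and store them server side.

```cpp
#include <chrono>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical token record: the secret value plus the moment it stops working.
struct SessionToken {
    std::string value;
    Clock::time_point expiresAt;
};

// Issues a token that is valid from now until now + lifetime.
SessionToken issueToken(const std::string& value, Clock::time_point now,
                        std::chrono::seconds lifetime) {
    return SessionToken{value, now + lifetime};
}

// A token is usable only strictly before its expiry time.
bool isUsable(const SessionToken& token, Clock::time_point now) {
    return now < token.expiresAt;
}

// Rotation: expire the old token immediately and hand out a fresh one.
// Anyone still holding the old value gets blocked from this moment on.
SessionToken rotateToken(SessionToken& oldToken, const std::string& newValue,
                         Clock::time_point now, std::chrono::seconds lifetime) {
    oldToken.expiresAt = now;
    return issueToken(newValue, now, lifetime);
}
```

Passing the current time in as a parameter is a design choice that keeps the logic easy to test. The place where isUsable returns false is also a natural spot to log the attempt, without logging the token value itself.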

Understand what is normal and abnormal use of the system. Make sure you report any form of anomalous access to references in your application logs (without exposing sensitive data). These anomalies should lead to further investigation to ensure someone has not stolen sensitive object references used to take critical actions or access sensitive data. For example, if you notice someone repeatedly trying to use an expired token, you probably want to investigate. It could be a bug, or something more nefarious.

Invalid Pointers

Numerous vulnerabilities exist related to invalid pointers. Different types of problems are broken down into individual CWEs (Common Weakness Enumerations) but they all boil down to one thing:

Know what your pointer is pointing to at all times and make sure it is valid as long as your pointer exists and may be used by your application.

Pointers are complicated. It will take a fair amount of logical thought to completely understand what pointers in your application point to at all times and ensure it is accurate.

I’m going to list some of the issues with pointers below and the related CWE. Each CWE has a list of related CVEs if you want to know more about how these problems may be abused. Successful abuse of any of the following could result in allowing an attacker to view sensitive data or execute code.

Understanding the value of your pointer starts from the point you create your pointer. Make sure you initialize it.

Don’t allow a pointer to reference a location in memory it should not be allowed to access.

I might add more in the book but for now, you can find more issues with pointers here.

One particular type of pointer error stood out while researching this post: Null Pointer Dereference (CWE-476). When searching for CVEs related to pointers, almost all the vulnerabilities from 2021 fell into this category. I explained some of the issues with a null value in my last post on data types. Whenever performing operations on pointers, check first to make sure the value is not null. If it is and you aren’t expecting a null value, handle the error carefully.
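As a simple illustration of checking before use, the sketch below looks up a phone number and returns a pointer that may be null. The PhoneBook type and both function names are hypothetical. The caller checks the pointer before dereferencing it and handles the missing case deliberately instead of crashing.

```cpp
#include <map>
#include <string>

// Hypothetical lookup table: customer name to phone number.
using PhoneBook = std::map<std::string, std::string>;

// Returns a pointer to the stored phone number, or nullptr when the
// customer is not in the book.
const std::string* findPhone(const PhoneBook& book, const std::string& name) {
    const auto found = book.find(name);
    return found == book.end() ? nullptr : &found->second;
}

// Checks the pointer BEFORE dereferencing it. Dereferencing a null pointer
// is undefined behavior in C++ and typically crashes the program.
std::string phoneOrMessage(const PhoneBook& book, const std::string& name) {
    const std::string* phone = findPhone(book, name);
    if (phone == nullptr) {
        return "no phone number on file";
    }
    return *phone;
}
```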

You can review numerous examples of this type of vulnerability here:

Another even more common error related to pointers is known as a use-after-free vulnerability (CWE-416). This type of vulnerability occurs when a pointer points to a particular block of memory, that memory is freed, and the program keeps using the pointer even though the memory may now hold something else. At the time of this writing, null pointer dereferences had caused about 330 vulnerabilities since January 1st, 2021. Use-after-free vulnerabilities occurred almost 700 times.
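One defense in modern C++ is to avoid raw pointers to data that something else may free, and to use smart pointers from the standard library instead. The sketch below is only one possible approach, and the function name nameOrGone is hypothetical. A std::weak_ptr works like pointing at your sister while being able to ask whether she is still standing there before anyone looks.

```cpp
#include <memory>
#include <string>

// Reads a value through a non-owning std::weak_ptr. lock() yields a usable
// std::shared_ptr only while the object is still alive, so this function
// never touches memory that has already been freed.
std::string nameOrGone(const std::weak_ptr<std::string>& observer) {
    if (const std::shared_ptr<std::string> alive = observer.lock()) {
        return *alive;
    }
    return "(gone)";  // the object was destroyed; nothing left to read
}
```

While a std::shared_ptr still owns the string, nameOrGone returns its value. After the owner calls reset() and the string is destroyed, the same call returns the fallback text instead of reading freed memory, which is what a raw pointer would have done.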

The types of errors that can occur from improper use of pointers can be broken into numerous categories, but they all relate to not understanding what a pointer points to at all times and ensuring the pointer and the value it points to are valid. Any time you have pointers or references of any kind in your application, ensure that the pointer cannot be referenced prematurely or after it is no longer valid. Ensure that the value of the pointer and what it points to is accurate at all times. Make sure people can’t steal pointers to access data or take actions that should not be allowed.

Next Steps

Review your code and add protections against the following bugs and abuses:

  • Check that pointers and references are not null before you use them.
  • Ensure pointers and references are not referenced before they are initialized.
  • Ensure pointers and references cannot be accessed after they should no longer be in use.
  • Ensure the data a pointer or reference points to cannot be altered to invalid or inaccurate data.
  • Ensure the pointer cannot be altered to point to something it shouldn’t.
  • Protect sensitive pointers to ensure attackers cannot access them.
  • Track anomalous use of sensitive pointers or references and write alerts to logs when unexpected actions occur.

Teri Radichel | © 2nd Sight Lab 2022