Teri Radichel

Faulty Security Control Logic

Just because a security control has a potential attack vector doesn’t make it useless

I want to address something I keep reading regarding security controls and attack vectors.

Saying that you shouldn’t use something just because it might be abused via a particular attack vector is faulty logic.

I’ll give you an example. I’ve heard people say, “Oh no! Firewalls can be bypassed with tunnels. WAFs are useless because of SSRF attacks!”

Well yes, various security controls can be bypassed by certain attacks, but that doesn’t mean they are irrelevant and useless against all attacks. A firewall can still block a port from accepting any connections at all, and that is the point.

I wrote in my book about how blocking port 445 from being accessible from the Internet would have saved everyone attacked by the WannaCry ransomware. That includes the hospitals that had to stop admitting patients as a result of ransomware infiltrating their networks.

Although a craftier attacker might have been able to tunnel through the firewall in a sneakier manner, the attacker didn’t need to, because people simply left ports exposed to the Internet that should have been restricted to private networks.
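
As a rough sketch of that idea, here is a small Python check that flags inbound rules leaving port 445 open to the whole Internet. The `InboundRule` shape and the sample rules are assumptions made up for illustration, not the format of any particular firewall or cloud provider.

```python
from dataclasses import dataclass
from ipaddress import ip_network

# Hypothetical rule shape for illustration; real firewalls and cloud
# security groups each have their own formats.
@dataclass(frozen=True)
class InboundRule:
    source_cidr: str   # e.g. "10.0.0.0/8" or "0.0.0.0/0"
    protocol: str      # "tcp", "udp", or "icmp"
    port_from: int
    port_to: int

def is_internet_wide(cidr: str) -> bool:
    """True when the CIDR covers every address (0.0.0.0/0 or ::/0)."""
    return ip_network(cidr, strict=False).prefixlen == 0

def exposed_to_internet(rules, port, protocol="tcp"):
    """Return the rules that allow the given port from anywhere."""
    return [
        r for r in rules
        if r.protocol == protocol
        and r.port_from <= port <= r.port_to
        and is_internet_wide(r.source_cidr)
    ]

# Made-up sample rules.
rules = [
    InboundRule("0.0.0.0/0", "tcp", 443, 443),   # public web traffic
    InboundRule("10.0.0.0/8", "tcp", 445, 445),  # SMB, private network only
    InboundRule("0.0.0.0/0", "tcp", 400, 500),   # overly broad port range
]

for rule in exposed_to_internet(rules, 445):
    print("Port 445 reachable from the Internet via:", rule)
```

Only the third rule is reported here: the broad 400-500 range quietly includes 445, which is the kind of exposure that is easy to miss when reading rules by hand.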

Firewalls stop attacks.

Firewalls don’t stop all attacks.

A firewall might not stop the type of ICMP tunneling I wrote about in my analysis of the Target Breach. Attackers exfiltrated data and sent commands via the ICMP protocol. You know, ping. If you need to allow ICMP, your firewall might be susceptible to the same technique. Attackers have also been known to exfiltrate data using DNS or perform attacks using IP options, NTP, or packet fragmentation. If an attacker can attack your web application via a valid channel, then yes, your firewall will not help in that case, because you need to allow the Internet to access port 443 and send responses.

There are other security controls for those attacks.

Actually, limiting your DNS traffic to specific DNS servers might help with DNS exfiltration, and crafting your firewall rules to allow only certain ICMP types and codes might help with ICMP abuse. As for your application, a firewall might even help for certain web-based attacks, but yes, you likely need additional controls for that scenario. A penetration test might help.
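
To make the allow-list idea concrete, here is a minimal Python sketch of the decision logic. The resolver addresses, the permitted ICMP type and code pairs, and the simplified `packet` dictionary are all assumptions for illustration. A real firewall expresses this in its own rule language.

```python
from ipaddress import ip_address

# Assumed values for illustration; substitute your own resolvers and policy.
ALLOWED_DNS_SERVERS = {ip_address("10.0.0.2"), ip_address("10.0.0.3")}

# Allowed ICMP (type, code) pairs: echo reply, echo request,
# and "fragmentation needed" (used by path MTU discovery).
ALLOWED_ICMP = {(0, 0), (8, 0), (3, 4)}

def allow_outbound(packet: dict) -> bool:
    """Decide whether a simplified outbound packet passes the allow-lists."""
    protocol = packet["protocol"]
    if protocol == "icmp":
        return (packet["icmp_type"], packet["icmp_code"]) in ALLOWED_ICMP
    if protocol in ("udp", "tcp") and packet["dst_port"] == 53:
        return ip_address(packet["dst_ip"]) in ALLOWED_DNS_SERVERS
    # This sketch covers only DNS and ICMP; everything else is denied.
    return False

packets = [
    {"protocol": "udp", "dst_ip": "10.0.0.2", "dst_port": 53},     # approved resolver
    {"protocol": "udp", "dst_ip": "203.0.113.9", "dst_port": 53},  # unknown resolver
    {"protocol": "icmp", "icmp_type": 8, "icmp_code": 0},          # echo request
    {"protocol": "icmp", "icmp_type": 13, "icmp_code": 0},         # timestamp request
]
for p in packets:
    print("allow" if allow_outbound(p) else "deny", p)
```

Note that echo requests are still allowed in this sketch, so a tunnel that hides data in ping payloads would need another control. The allow-list narrows the gap without claiming to close it.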

Just because a particular security control is not foolproof doesn’t mean you should not use it. Security architecture never depends on one control. That’s why you need security architects who understand how all your controls work together.

The same faulty logic applies to the argument that people misconfigure a security control all the time, therefore we shouldn’t use it, or that a security control is hard, therefore we shouldn’t use it. What is the alternative? A data breach because something is too complicated or takes time? I wrote about that in another post.

Rather than saying something is hard, or misconfigured, or that attacks exist and so we shouldn’t use it, address the actual problem.

  • How can you make it easier without losing protection?
  • How do you prevent misconfigurations?
  • How do you address the gap that exists due to a particular attack vector without throwing away other protections?

The ultimate question is: will your change make it easier for attackers to infiltrate your systems? If you give up on a control because it has a particular attack vector, how many other attack vectors have you introduced that didn’t exist when you had that control in place?
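
One way to reason about that question is to list which attack vectors each control helps with and see what is left uncovered when a control is removed. The Python sketch below does that with a made-up mapping. The control names and vectors are hypothetical, loosely based on the examples in this post, and are not a real assessment.

```python
# Hypothetical mapping of controls to the attack vectors they help with.
# The entries are illustrative only.
controls = {
    "firewall": {"exposed port 445", "other exposed ports"},
    "dns allow-list": {"DNS exfiltration"},
    "icmp type filter": {"unneeded ICMP types"},
    "network segmentation": {"exposed port 445"},
}

def uncovered_after_removal(controls, removed):
    """Return the vectors that no remaining control addresses."""
    remaining = set()
    for name, vectors in controls.items():
        if name != removed:
            remaining |= vectors
    return controls[removed] - remaining

print(uncovered_after_removal(controls, "firewall"))
# → {'other exposed ports'}
```

In this toy example, dropping the firewall leaves port 445 covered by another layer but exposes every other port the firewall was blocking. That is the trade-off to evaluate before removing a control.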

Follow for updates.

Teri Radichel | © 2nd Sight Lab 2023

About Teri Radichel:
~~~~~~~~~~~~~~~~~~~~
Author: Cybersecurity for Executives in the Age of Cloud
Presentations: Presentations by Teri Radichel
Recognition: SANS Difference Makers Award, AWS Security Hero, IANS Faculty
Certifications: SANS
Education: BA Business, Master of Software Engineering, Master of Infosec
Company: Cloud Penetration Tests, Assessments, Training ~ 2nd Sight Lab
