avatarTeri Radichel

Free AI web copilot to create summaries, insights and extended knowledge, download it at here

5534

Abstract

ime.datetime.now()), <span class="hljs-string">'price'</span>: price, <span class="hljs-string">'item'</span>: item, <span class="hljs-string">'discount'</span>: discount }</pre></div><p id="7bae">In our case, the output is:</p><div id="477f"><pre>{ 'date': '<span class="hljs-number">2018-05-13</span> 13:37:21.<span class="hljs-number">414342</span>', 'price': '169.99', 'item': 'Dyson AM08 Bladeless Pedestal Fan | White/Silver | Refurbished', 'discount': '399.99 | 57% off' }</pre></div><h1 id="8e42">Step 2: Saving to S3</h1><p id="f971">To save our result to S3, we use <a href="https://boto3.readthedocs.io/en/latest/index.html">Boto 3</a>, the AWS SDK for Python. First we get a reference to S3. Then we create an <code>object</code> with given <code>bucket</code> and <code>file_name</code>(the bucket was created beforehand though this can be done programmatically). Finally, we convert our data to <code>json</code> and write our <code>data</code> into the object.</p><div id="acf2"><pre><span class="hljs-keyword">import</span> boto3 <span class="hljs-keyword">import</span> json</pre></div><div id="34fd"><pre>def <span class="hljs-built_in">save_file_to_s3</span>(bucket, file_name, data): s3 = boto3.<span class="hljs-built_in">resource</span>(<span class="hljs-string">'s3'</span>) obj = s3.<span class="hljs-built_in">Object</span>(bucket, file_name) obj.<span class="hljs-built_in">put</span>(Body=json.<span class="hljs-built_in">dumps</span>(data))</pre></div><p id="1bfc">We will package this utility function in the handler file where we house the actual function Lambda will call. More on this in the following steps.</p><p id="5919">Incidentally, if saving tabular data as in our example, we might choose a database instead of S3. We illustrate S3 here as it is also a good choice for documents, we are also frequently scraped items.</p><h1 id="dd59">Step 3: The Handler Function</h1><p id="6f6c">A Lambda function needs a <a href="https://docs.aws.amazon.com/lambda/latest/dg/python-programming-model-handler-types.html">handler function</a>, which is the function Lambda will execute when it gets called. We will put the handler function, along with our Python dependencies, in a sub-directory called <code>ebay_deal_scraper</code>. This will allow us to separate the files which will be part of the Lambda package, and ancillary project files such as the Serverless config file discussed in later steps.</p><p id="2abf">We name our handler function <code>scrape</code> and give it the signature required by Lambda. We don’t use the <code>event</code> or <code>context</code> parameters, but if you needed to pass data into your Lambda function, they are what you would use.</p><div id="2ee8"><pre><span class="hljs-title">def</span> scrape(event, context): <span class="hljs-class"><span class="hljs-keyword">data</span> = deal_scrape()</span> file_name = f<span class="hljs-string">"deals-{data['date']}"</span> save_file_to_s3('ebay_daily_deals', file_name, <span class="hljs-class"><span class="hljs-keyword">data</span>)</span></pre></div><p id="c7da">Our handler calls our <code>deal_scrape()</code> function, then writes the returned data to S3 under a file name based on the date.</p><h1 id="6b1f">Step 4: Packaging our Function</h1><p id="8815">Our custom code is ready to go, but Lambda also requires you include your dependencies in the package you upload to AWS. In our case this means <code>pip installing</code> our Python packages locally. In the <code>ebay_deal_scrapper</code> directory, we run:</p><div id="bc42"><pre>pip3 <span class="hljs-keyword">install </span>requests <span class="hljs-keyword">bs4 </span>-t .</pre></div><p id="2ef0">(if you have any problems, see this <a href="https://stackoverflow.com/questions/24257803/distutilsoptionerror-must-supply-either-home-or-prefix-exec-prefix-not-both">Stack Overflow issue</a>)</p><p id="8621">This will install the <code>requests</code> and <code>beatiful soup</code> packages in our directory. Lamdba has<code>boto3</code> pre-installed, so you don’t need to include it.</p><p id="bea3">Incidentally, including dependencies can be quite hairy if you require platform-dependent C/C++ libraries like <a href="https://www.boost.org/">Boost</a>, and may require you to use Docker to bundle everything together under the Amazon Linux OS that Lambda requires. But that’s the topic of another blog post.</p><p id="8188">When you have dependencies, you probably want to use a zip file to package everything up. We give ours the generic name of <code>package.zip</code>:</p><div id="f3ce"><pre><span class="hljs-built_in">zip</span> -r package.<span class="hljs-built_in">zip</span> *</pre></div><h1 id="b305">Step 5: Deploying to Lambda</h1><p id="6f52">We will use the Serverless framework to deploy to AWS. Serverless offers a set of command line tools which make it very easy to deploy to the major serverless cloud providers including AWS. You can install it using <code>npm</code>:</p><p id="8fc7"><code>npm install -g serverless</code></p><p id="110d">You specify Serverless deployment instructions in a file called <code>serverless.yml</code>. If you want to generate a boilerplate file, you can run:</p><div id="4295"><pre>serverless <span class="hljs-built_in">create</span> <span class="hljs-comment">--template aws-python</span></pre></div><p id="d62a">This will also create a boilerplate<code>handler.py</code> file, but all we need there is the function which we specify i

Options

n the <code>serverless.yml</code> config file under the <code>functions</code> section.</p><div id="7831"><pre><span class="hljs-symbol">service:</span> ebay-deal-scraper <span class="hljs-symbol"> provider:</span> <span class="hljs-symbol"> name:</span> aws <span class="hljs-symbol"> runtime:</span> python3<span class="hljs-number">.6</span> <span class="hljs-symbol"> package:</span> <span class="hljs-symbol"> artifact:</span> ebay_deal_scraper/package.zip <span class="hljs-symbol"> functions:</span> <span class="hljs-symbol"> ebay_scrape:</span> <span class="hljs-symbol"> handler:</span> handler.scrape</pre></div><p id="8b75">The <code>provider</code> section tells Serverless we’re deploying to AWS and that we’re using <code>python 3.6</code>.</p><p id="5fec">The <code>package</code> section is where we specify the zip file we created in the last step.</p><p id="5313">We are ready to deploy!</p><p id="273c">From the top directory of your project run:</p><div id="d1ed"><pre><span class="hljs-attribute">serverless deploy</span></pre></div><p id="c024">If you have multiple AWS profiles (such as work and personal), you can specify a profile:</p><div id="981f"><pre>serverless <span class="hljs-keyword">deploy</span> <span class="hljs-params">--aws-profile</span> profile_i_want_to_use</pre></div><p id="eed1">If all goes well, you should get the following output:</p><div id="4f5e"><pre>➜ ebay<span class="hljs-params">-deals</span><span class="hljs-params">-scrape</span> git:(master) ✗ sd -<span class="hljs-params">-aws</span><span class="hljs-params">-profile</span> michael Serverless: Packaging service<span class="hljs-params">...</span> Serverless: Uploading CloudFormation file <span class="hljs-keyword">to</span> S3<span class="hljs-params">...</span> Serverless: Uploading artifacts<span class="hljs-params">...</span> Serverless: Validating template<span class="hljs-params">...</span> Serverless: Updating <span class="hljs-built_in">Stack</span><span class="hljs-params">...</span> Serverless: Checking <span class="hljs-built_in">Stack</span> update progress<span class="hljs-params">...</span> <span class="hljs-params">...</span><span class="hljs-params">...</span><span class="hljs-params">...</span> Serverless: <span class="hljs-built_in">Stack</span> update finished<span class="hljs-params">...</span> Service Information service: ebay<span class="hljs-params">-deal</span><span class="hljs-params">-scraper</span> stage: dev region: us<span class="hljs-params">-east</span><span class="hljs-number">-1</span> <span class="hljs-built_in">stack</span>: ebay<span class="hljs-params">-deal</span><span class="hljs-params">-scraper</span><span class="hljs-params">-dev</span> api keys: <span class="hljs-literal">None</span> endpoints: <span class="hljs-literal">None</span> functions: ebay_scrape: ebay<span class="hljs-params">-deal</span><span class="hljs-params">-scraper</span><span class="hljs-params">-dev</span><span class="hljs-params">-ebay_scrape</span> Serverless: Removing old service versions<span class="hljs-params">...</span></pre></div><p id="d320">You can now try to invoke your function with:</p><div id="b911"><pre>serverless<span class="hljs-built_in"> invoke </span>-f ebay_scrape <span class="hljs-comment"># --aws_profile profile_name </span></pre></div><p id="4385">This will result in an AccessDenied error. To fix this, you need to add permission to your Lambda function’s role (this gets created together with the function).</p><h1 id="da14">Step 6: Giving Lambda S3 Privileges</h1><p id="b13d">Go to the Roles page in the IAM section of the AWS dashboard. Find the role for your Lambda function, in our case <code>ebay-deal-scraper-dev-us-east-1-lambdaRole</code> and add a policy which allows it to access S3. <code>AmazonS3FullAccess</code> will work for our demo, though in production you may want to create a new policy which is more restrictive.</p><p id="e0d1">Once your function is deployed you should also be able to view and test it from the AWS console.</p><figure id="dc68"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*A1uCWEB2Hb-YS7lMEiXCww.png"><figcaption></figcaption></figure><h1 id="6080">Step 7: Scheduling the Lambda Function Using CloudWatch</h1><p id="8cd3">Final step! Go to the CloudWatch Management Page and click the<code>Rules</code> tab. Under Event Source, select <code>Schedule</code> and fill in a cron expression. We set ours to run every day at 6PM GMT, or afternoon in US Eastern Time. Next, in the <code>Targets</code> section, choose <code>Lambda function</code> in the select and then your Lambda function from the list in the <code>Function</code> select. You’re done!</p><figure id="1595"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*amgaVqAlsAl0pQdYaAP_PA.png"><figcaption></figcaption></figure><p id="eb37">That’s the whirlwind tour of scraping the serverless way. I hope this gives a taste of the technologies involved, though each of them could be the subject of many, many blog posts.</p><p id="f860">As far as scraping goes, with the rise of big data/machine learning, data acquisition is becoming more and more important. And if, for instance, you wanted to do something like feed your machine learning model with data to <a href="https://www.dataquest.io/blog/machine-learning-tutorial/">make price predictions for AirBnB</a>, scraping might be your only option. The serverless architecture exemplified here offers an efficient way to do this.</p></article></body>

GitHub Enterprise and Organization IP Allow Lists

Make sure you understand and manage all of them

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

⚙️ Part of my series on Automating Cybersecurity Metrics. The Code.

🔒 Related Stories: GitHub Security | Application Security | Secure Code

💻 Free Content on Jobs in Cybersecurity | ✉️ Sign up for the Email List

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

At some point GitHub popped up a message when I logged in and asked me if I wanted to create an “Enterprise” or something to that effect.

What’s this? Sure why not if it doesn’t cost anything. I imagine it’s a way to have an enterprise with different GitHub organizations for different lines of business. That makes sense. You might want to allow different people to control the GitHub settings for different parts of the organization.

I do these things to test configuring and security of certain products like GitHub, Okta, AWS and other cloud services. I didn’t know what would happen to my account when I clicked that button.

What it did was to create an enterprise level context above my existing organization. When you login you switch to whichever context you want to manage at the top of the screen.

What I also did not initially realize is that I had two different IP allow lists and somehow I ended up with IPs in both lists. I’m not sure how pushing that button determined which IPs should be in what list or I didn’t realize what context I was in when I added some new IPs.

I wrote about using an AWS EIP with GitHub here:

So it could be that once you set up an enterprise in GitHub you are looking at the enterprise context and thinking you know all the IPs that have access to your account, but that might not be the case. You also have to look at each individual organization context for that information.

One thing I don’t understand is why GitHub puts this under “Authentication Security” instead of “Network Security” but that’s where you’ll find it once you switch to your organization or enterprise context:

The other thing is, you can’t enable or disable IP AllowLists using your organizational context. You have to do that from the Enterprise context:

I also don’t know why it says my IdP is not supported since I did not add an IdP but that’s another story.

If you head over to your Enterprise context it looks similar but slightly different and you can check or uncheck the boxes related to the IP Allow List. But be aware that others may be able to add additional IP addresses under the organization settings as well.

Follow for updates.

Teri Radichel | © 2nd Sight Lab 2024

About Teri Radichel:
~~~~~~~~~~~~~~~~~~~~
⭐️ Author: Cybersecurity Books
⭐️ Presentations: Presentations by Teri Radichel
⭐️ Recognition: SANS Award, AWS Security Hero, IANS Faculty
⭐️ Certifications: SANS ~ GSE 240
⭐️ Education: BA Business, Master of Software Engineering, Master of Infosec
⭐️ Company: Penetration Tests, Assessments, Phone Consulting ~ 2nd Sight Lab
Need Help With Cybersecurity, Cloud, or Application Security?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
🔒 Request a penetration test or security assessment
🔒 Schedule a consulting call
🔒 Cybersecurity Speaker for Presentation
Follow for more stories like this:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
❤️ Sign Up my Medium Email List
❤️ Twitter: @teriradichel
❤️ LinkedIn: https://www.linkedin.com/in/teriradichel
❤️ Mastodon: @teriradichel@infosec.exchange
❤️ Facebook: 2nd Sight Lab
❤️ YouTube: @2ndsightlab
Github
Security
Ip Allow List
Application Security
Cybersecurity
Recommended from ReadMedium