Category Archive for: Python

RFP ROBOT: Website Request for Proposal Generator

The time has come for a new website (or website redesign), which means you need to write a website request for proposal, or web RFP. A Google search produces a few examples, but they vary wildly and don’t really speak to your goals for building or redesigning a website. You need to write a website RFP that will clearly articulate your needs and generate responses from the best website designers and developers out there. But how?

Have no fear, RFP Robot is here. He will walk you through a step-by-step process to help you work through the details of your project and create a PDF-formatted website design RFP that gives vendors the information they need to write an accurate bid. RFP Robot will tell you what info you should include, point out pitfalls, and give examples.


Natural Language Processing of Viget.com Articles

In a previous article, I talked about scraping article data from Viget.com. I pulled the title, author name, hashtags, date, and text from each of our posts. My goal was to analyze the text data using topic modeling and word embeddings, in an attempt to learn more about the type of content we are producing on Viget.com. Natural language processing is a large field of study and the techniques I walk through in this article are just the tip of the iceberg. I will go over preprocessing text data, vectorizers, topic modeling, and word embeddings at a high level. Hopefully, this article gets the wheels turning in your mind about how you too can analyze your large bodies of text and gain understanding with natural language processing. Unlike using Bayesian models to forecast customer lifetime value, these natural language processing techniques may have less-obvious use cases for business-minded readers. But…

Read More →
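
As a rough illustration of the preprocessing and vectorizer steps mentioned above, here is a minimal sketch using only the Python standard library. It is not the code from the article: the stop-word list, the function names `preprocess` and `count_vectorize`, and the two sample sentences are all made up for this example, and a real analysis would more likely use a dedicated NLP library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "with"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def count_vectorize(docs):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[j] in docs[i]."""
    tokenized = [preprocess(d) for d in docs]
    vocabulary = sorted({t for tokens in tokenized for t in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Made-up sample documents standing in for scraped article text.
docs = [
    "Scraping the web with Python",
    "Topic modeling with Python and word embeddings",
]
vocabulary, matrix = count_vectorize(docs)
print(vocabulary)
print(matrix)
```

Each row of `matrix` is a bag-of-words vector for one document. Topic models and many other techniques start from a representation like this one.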

Web Scraping with BeautifulSoup

Python’s BeautifulSoup library makes scraping web data a breeze. With a basic understanding of HTML and Python, you can pull all the data you need from web pages. In this article, I go through an example of web scraping by pulling text data from Viget.com.

Warning: Before you begin scraping a site, make sure not to violate the site’s Terms of Service. Don’t violate the rules in the site’s robots.txt, and don’t use an overly aggressive crawl rate. Read Benoit Bernard’s blog post about the legality behind web scraping before you start, and consult your legal team before scraping or crawling a site.

The easiest sites to scrape are those with a consistent HTML structure. Let’s look at Viget.com as an example. If I wanted to pull the author name from each article, I could search each document for the class name ‘credit__author-name,’ and I would find…

Read More →
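
To make the class-name lookup described above concrete, here is a small sketch. It swaps BeautifulSoup for the standard library's `html.parser` so it runs without extra installs; with BeautifulSoup the same lookup would be roughly `soup.find_all(class_="credit__author-name")`. The `ClassTextExtractor` helper, the sample HTML, and the author names are invented for illustration and are not taken from Viget.com.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element carrying a given CSS class.

    Simplification: assumes matching elements contain no void tags
    such as <br>, which have no closing tag to balance the depth count.
    """

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0  # greater than 0 while inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1  # nested tag inside a match
        elif self.class_name in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data


# Made-up markup that mimics a consistent article structure.
SAMPLE_HTML = """
<article><span class="credit__author-name">Ada Example</span></article>
<article><span class="credit__author-name"> Grace Sample </span></article>
"""

parser = ClassTextExtractor("credit__author-name")
parser.feed(SAMPLE_HTML)
authors = [name.strip() for name in parser.results]
print(authors)
```

The consistent structure is what makes this work: because every article marks the author with the same class, a single lookup covers every page.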

Defending Against a Self-Propagating Drupal Botnet Attack

On the 28th of March 2018 the Drupal Security Team announced SA-CORE-2018-002, a serious Remote Code Execution vulnerability, which came to be known by many as “Drupalgeddon 2”. The patches to Drupal core were quite simple and Acquia implemented a platform-level mitigation within a few hours of the announcement. However, it was not trivial to reverse engineer the actual exploit(s), and it took a couple of weeks for credible Proof of Concept exploits to be published. Of course, once they were, exploit attempts were seen in the wild shortly after. Approximately one month after the initial announcement, Acquia’s Incident Response (IR) team began to see evidence of an attempt to exploit the vulnerability, apparently coordinated from a single IP address belonging to a fairly well-known French “cloud computing company”. The real server IP has been changed in this write-up. From the Drupal site logs, the attack followed a familiar…

Read More →

Melodies and Metrics: Musical Attributes and the Stories They Tell

Yasiin Bey, formerly known as Mos Def, once advised that if “you want to know how to rhyme, you better learn how to add — it’s mathematics.” The music-as-math analogy extends to my university’s campus, where Penn allows both “Calculus I” and “Making Sense of Music” to satisfy the Formal Reasoning & Analysis requirement. Machine learning has even allowed a computer to imitate Bach. So, with the hope of better understanding my favorite artists and possibly even finding some new jams, I decided to analyze the data buried within my music. A couple of existing datasets attempt to quantify a song’s sound. One such example, Pandora’s vaunted Music Genome Project, unfortunately keeps its data confidential. Luckily, in 2014, Spotify acquired a music data analysis company, The Echo Nest, and now makes available on its public API a set of automatically generated “audio features.” The list of audio features includes not only objective statistics…

Read More →

Learn Data Science: My Favorite Resources

When I started learning about data science, I was overwhelmed by the ocean of resources available online. Thankfully, a few practicing data scientists and professors guided me in the right direction. Below is a list of resources that I found most useful; hopefully they will kickstart your data science fascination, as they did for me.

Python

If you are completely new to programming, learning the basics of Python on Codecademy is your most-logical first step. You don’t need to be a software developer to practice data science, but you should work to become proficient at programming. As you grow your data science career, expect your programming skills to also grow. Data Camp is a great introduction to applying Python for data science. They have many courses that will help you nail down the basics of data science. Data Camp is not free, but its pricing is approachable at $30…

Read More →

Using Apache Spark for Data Processing: Lessons Learned

As a Data Scientist at Acquia I get to build machine learning models to solve problems or speed up tasks that are time consuming for humans. This means I spend a lot of time getting data into a format that is usable by machine learning models, or even just putting it in a useful form for exploratory analysis. One of the tools I use for handling large amounts of data and getting it into the required format is Apache Spark. Spark is an open source analytics engine for large scale data processing that allows data to be processed in parallel across a cluster. I use it in combination with AWS Elastic MapReduce (EMR) instances which provide more computing resources than my laptop can provide. I work with events related to web browsing, such as what people click on, the timestamp on those clicks, and the web browser they are using.…

Read More →

Foreword for CSS In Depth

Keith Grant recently released a brand new book on CSS: CSS in Depth. If you’re looking for a book focused specifically on learning CSS, you’ve found it. I was happy to write the foreword for it, which I’ll republish here. “A minute to learn… A lifetime to master.” That phrase might feel a little trite these days, but I still like it. It was popularized in modern times by being the tagline for the board game Othello. In Othello, players take turns placing white or black pieces onto a grid. If, for example, a white piece is played trapping a row of black pieces between two white, all the black pieces are flipped and the row becomes entirely white. Like Othello, it isn’t particularly hard to learn the rules of CSS. You write a selector that attempts to match elements, then you write key/value pairs that style those elements. Even…

Read More →

Acquia blocks 500,000 attack attempts for SA-CORE-2018-002

On March 28th, the Drupal Security Team released a bug fix for a critical security vulnerability, named SA-CORE-2018-002. Over the past week, various exploits have been identified, as attackers have attempted to compromise unpatched Drupal sites. Hackers continue to try to exploit this vulnerability, and Acquia’s own security team has observed more than 100,000 attacks a day. The SA-CORE-2018-002 security vulnerability is highly critical; it allows an unauthenticated attacker to perform remote code execution on most Drupal installations. When the Drupal Security Team made the security patch available, there were no publicly known exploits or attacks against SA-CORE-2018-002. That changed six days ago, after Checkpoint Research provided a detailed explanation of the SA-CORE-2018-002 security bug, in addition to step-by-step instructions that explain how to exploit the vulnerability. A few hours after Checkpoint Research’s blog post, Vitalii Rudnykh, a Russian security researcher, shared a proof-of-concept exploit on GitHub. Later that day,…

Read More →

Talks, Thoughts, and Texas: Viget at SXSW 2018

While Olympics highlights and Valentine’s day memories are fresh in our minds, I’m here to ease you into the impending month of March. Not for the basketball madness, or St. Patrick’s day traditions — but for the tech tradition of SXSW and next week’s festivities. And in what will be our third consecutive year with multiple talks, we’ll be sending our own small crew to Texas — including some fresh faces — for the knowledge, for the sharing, and for the free things they hand you while walking around. In addition to our two workshops, here are a few talks, and thoughts on SXSW 2018: Thought: “I think the implications of AI are growing and being discovered at, or behind the pace of, AI tech which makes it an increasingly interesting, albeit a little scary at times, technology to learn about and work with.” – Ian Brennan, Viget Developer Talk: Regulating AI:…

Read More →

Meetings are Toxic

Meetings are one of the worst kinds of workplace interruptions. They’re held too frequently, run too long, and involve more people than necessary. You may have gathered that we really dislike meetings at Basecamp. And many of you do too! This episode of Rework features:

- A group of philosophy professors in a meeting they Kant seem to end. You might say it had…No Exit. One attendee, at least, found enough Hume-r in it to tell us about it.
- A meeting about a meeting.
- A dramatic reading about conference calls from hell.
- Basecamp programmer Dan Kim talking about his post on recurring meetings and what you (yes, you!) can do to start changing the ingrained culture of meetings at your company.
- A brief, pedantic aside to note the difference between garters and garter belts.
- A cringeworthy meeting with an unwanted participant, and an unexpected outcome.

We had more listener-submitted meeting stories than we could feature in the episode, so here are a couple bonus ones!

No Work Done

I had…

Read More →

Full-Stack Product Developer – Treeline Interactive – San Diego, CA

Treeline Interactive is searching for a Web Developer to join our growing Development Team. Flash, PHP, Python, Javascript, Node.js, Drupal, Laravel, WordPress,…
From Treeline Interactive – Sat, 23 Sep 2017 06:46:46 GMT
Source: http://rss.indeed.com/rss?q=Drupal+Developer

Senior LAMP/Python Developer – POP – Seattle, WA

Preferred experience with CMS systems like WordPress, Expression Engine, Drupal, and Django. Software development often takes on the most challenging…
From POP – Thu, 31 Aug 2017 21:18:03 GMT
Source: http://rss.indeed.com/rss?q=Drupal+Developer

Want to expand your Google Analytics skills or land a full-time job? Start here.

People often contact Viget about our analytics training offerings. Because the landscape has changed significantly over the past few years, so has our approach. Here’s my advice for learning analytics today. We’ll break this article into two parts; choose which part is best for you:

1. I’m in a non-analytics role at my organization and looking to become more independent with analytics.
2. I’d like to become a full-time analyst in an environment like Viget’s, either as a first-time job or as a career change.

“I’m in a non-analytics role at my organization and looking to become more independent with analytics.”

Great! One more question: do you want to learn about data analysis or configuring new tracking?

Data Analysis: At Viget, we used to offer full-day public trainings where we covered everything from beginner terminology to complex analyses. Over the past few years, however, Google has significantly improved…

Read More →

Improving Conversations using the Perspective API

I recently came across an article by Rory Cellan-Jones about a new technology from Jigsaw, a development group at Google focused on making people safer online through technology. At the time they’d just released the first alpha version of what they call The Perspective API. It’s a machine learning tool that is designed to rate a string of text (i.e. a comment) and provide you with a Toxicity Score, a number representing how toxic the text is. The system learns by seeing how thousands of online conversations have been moderated and then scores new comments by assessing how “toxic” they are and whether similar language had led other people to leave conversations. What it’s doing is trying to improve the quality of debate and make sure people aren’t put off from joining in. As the project is still in its infancy it doesn’t do much more than that. Still, we…

Read More →
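
To give a feel for what "rate a string of text and get a Toxicity Score" looks like in code, here is a hedged sketch that builds a request body and reads a score out of a response, without making any network call. The field names (`comment`, `requestedAttributes`, `attributeScores`, `summaryScore`) follow my recollection of Perspective's public documentation and should be checked against the current docs. The helper names are hypothetical, and the sample response and its 0.12 value are hand-written for illustration, not real API output.

```python
import json


def build_request(comment_text):
    """Build a request body asking for a TOXICITY score (assumed field names)."""
    return {
        "comment": {"text": comment_text},
        "requestedAttributes": {"TOXICITY": {}},
    }


def toxicity_score(response):
    """Pull the summary toxicity value out of a parsed response (assumed shape)."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


# Hand-written example response; the value is illustrative, not a real score.
sample_response = json.loads(
    '{"attributeScores": {"TOXICITY": '
    '{"summaryScore": {"value": 0.12, "type": "PROBABILITY"}}}}'
)

payload = build_request("Thanks for sharing, I learned a lot.")
score = toxicity_score(sample_response)
print(json.dumps(payload))
print(score)
```

In a real integration the payload would be sent to the API with a key, and the returned score compared against a threshold chosen by the site's moderators.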

Building a Simple API with Amazon Lambda and Zappa

We recently had a client come to us with a request for a simple serverless API. They wanted little to no administrative overhead, so we went with the AWS Lambda service. It was my first foray into Lambda, and getting it set up came with its fair share of headaches. If you’re starting down the same path and want to build a simple API with Lambda, here’s a tutorial to help.

GitHub

If you would rather go through the tutorial on GitHub, you can find it here.

AWS Lambda

This is a great service offered by AWS that allows users to run a serverless application or function. It’s a cloud-based, serverless architecture that comes with continuous scaling out of the box. Deploy your code, and AWS does the rest. It will only run when “triggered,” either by another AWS service, or an HTTP call. It’s relatively young and has room…

Read More →
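
For readers who have not seen a Lambda function before, here is a minimal sketch of a handler that responds to an HTTP trigger. It assumes the API Gateway proxy-style event and response shape (`queryStringParameters`, `statusCode`, `body`). It is not the tutorial's code: Zappa is normally used to deploy a WSGI app such as Flask rather than a bare handler like this one, and the `name` parameter and greeting are made up for the example.

```python
import json


def lambda_handler(event, context):
    """Return a JSON greeting; runs only when the function is triggered."""
    # queryStringParameters can be missing or None when no query string is sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local invocation with a hand-built event, so no AWS account is needed.
response = lambda_handler({"queryStringParameters": {"name": "reader"}}, None)
print(response["statusCode"], response["body"])
```

Because the handler is a plain function, it can be exercised locally like this before anything is deployed.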

Programming languages aren’t a zero sum game

Stop me if you’ve heard these before when people get to talking about programming languages…

“These features are copied from <superior language>.”
“Nothing new here. <superior language> has done this for years.”
“This language has nothing on <superior language>, but nobody realizes it.”
“<superior language> does the same thing, but better.”

I bring it up because I’ve been reading and writing a lot about Kotlin lately. And invariably someone posts a snarky comment like one of those above, carrying with it a clear innuendo: my preferred programming language is better than yours.

And every time I see those I leave with the same reaction. Who gives a shit?

Now I’m not talking about people who are having constructive conversations or even just poking fun. Hell, I may have been known to take a jab at Java every once in a while. 👊

I’m talking about a subset of programmers who treat languages like it’s a zero sum game: that for one language to succeed, another (or all…

Read More →

Python Developer – Kaizen Technologies Inc – Trevose, PA

Python Developer Location:. Expert in 70% Python, 30% PHP with knowledge of at least one Python web framework such as Django, Flask, etc. depending on your…
From conrep – Wed, 28 Jun 2017 20:01:32 GMT
Source: http://rss.indeed.com/rss?q=Drupal+Developer

Web Site Engineer – BLH Technologies, Inc. – Rockville, MD

Development/design experience using Drupal, JBoss, REST, Python and MySQL. Work with team of developers and engineers in building scripts for continuous…
From BLH Technologies, Inc. – Wed, 28 Jun 2017 09:52:52 GMT
Source: http://rss.indeed.com/rss?q=Drupal+Developer
