Using big data to generate content ideas | A SOP Story

TL;DR? Big data knows what questions people have, and you can capitalize on that to generate an endless list of answer-focused content ideas that draw from and showcase your experience, knowledge, and expertise. Scroll down to the section titled ‘Bringing it together’ for a quick list of the steps. Scroll back up for the nuance.

There’s nothing new under the sun, so what could you possibly have to offer the world that hasn’t been addressed before?

I’m always a big advocate of asking people for help when I get stuck, and, as it turns out, people are telling you what they want you to talk to them about all of the time. If you’re systematic about how you listen, you’ll build up quite a data store to flip through the next time you’re missing your muse and staring down a blank page and a deadline.

This SOP Story is a blend of my own experience using a paper notebook (I’ll explain why it’s paper below) and one of my favorite content idea tools, AnswerThePublic. I won’t go as far as refining a topic or optimizing content for SEO. This is about idea generation, that is, brainstorming using a mix of your own history and the history of others as captured in big data. So, it’s really a story about a data process, and it’s the process I use to generate potential content ideas for my monthly video series, SOP Stories, and newsletter tips.

An idea notebook to record your own conversation data

This story starts with a paper notebook, one filled with blank, lined sheets. I use it to store random snippets of conversations that light my creative fire with respect to my favorite professional topic – data strategy. Each snippet ties back to a question someone asked me directly, or to a moment when I used my experience, knowledge, and expertise to clarify or respond to something. Most of these notes are fairly short: “bias a la LinkedIn exchange with Kim” or “backfilling data.”

Why am I using paper? Simply, seeing the scribble and pen color and where it is on the page and how it’s juxtaposed with other things takes me back to that moment when I decided that this was an idea worth jotting down, and I can pick up the threads of the thought from there. When I tried digital formats, all of my notes became oddly useless.

What makes this count as data? As I noted in my video on Where data comes from…, data is just some bits of stuff we collect, a set of observable characteristics (i.e., variables and clues). In this case, the data is snippets of conversations. Data becomes information when we attach some meaning to it, usually to help us understand the world, interact, or make decisions. As you might have guessed, the mental triggers that I get from looking over my paper notes are the first part of turning this data into information.

The notebook is just a starting point. I want to find the larger, public conversations on the topics in my data, and join them in a way that people care about, that is, in a way that answers their questions. For that, I need more conversation data.

A big data tool to eavesdrop on the conversations of others

It’s at this point that this story segues from my private notebook to the world of big data, specifically AnswerThePublic. In their own words, “AnswerThePublic listens into autocomplete data from search engines like Google then quickly cranks out every useful phrase and question people are asking around your keyword.” It’s a web-based tool, and you don’t need to create an account or sign up for anything to use it. If you’re using the free version, you can run up to two searches per day from your location.

Visually, they dump out various “wheels” (see below) around the one- or two-word phrase that you enter, classified in various ways (e.g., questions like how, what, which), and color coded according to relative popularity (i.e., darker green means more popular at the moment of your search). Take note of what I just said about popularity – if you run the search at another time of year, or in a different year, you might end up with different results, because what’s popular and searched for changes over time. If you upgrade to the paid pro version, they’ll save searches for you, which can be handy for comparing popularity trends over time, but I use other tools for that. (Technical note: Using the free version, I save my search results as CSV files and process them in MS Excel. Popularity ranks are implied by the order of the exported results but are not made explicit in a data column, so you’ll “lose” that information if you sort before adding a rank column yourself.)
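
If you'd rather script that rank-preserving step than do it by hand in Excel, here is a minimal Python sketch. It assumes only that the export is a CSV with a header row and that row order reflects popularity; the column names and sample rows below are made up for illustration and may not match the real export.

```python
import csv
import io


def add_rank_column(csv_text, rank_field="rank"):
    """Return the rows of a CSV export with an explicit 1-based rank column.

    The rank is simply each row's position in the original export order,
    so it must be added before any sorting takes place.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    ranked = []
    for position, row in enumerate(reader, start=1):
        ranked_row = dict(row)
        ranked_row[rank_field] = position
        ranked.append(ranked_row)
    return ranked


# Hypothetical export contents: the real column names and rows may differ.
sample_export = (
    "modifier,suggestion\n"
    "why,why missing data is a problem\n"
    "how,how to handle missing data\n"
)

ranked_rows = add_rank_column(sample_export)

# Sorting alphabetically no longer loses the original order,
# because each row now carries its rank with it.
sorted_rows = sorted(ranked_rows, key=lambda r: r["suggestion"])
```

The same idea works in a spreadsheet: add a numbered column before you sort anything.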

Let’s run through an example, starting with “backfilling data” from my notebook. Putting this into AnswerThePublic yielded some disappointingly scanty results, as evidenced by the question wheel portion of the results.

[Image: question wheel showing 4 questions for how, 3 questions for what, and nothing else]

The issue of backfilling data (i.e., going in after the fact to fill in data that, for whatever reason, wasn’t recorded) is actually a missing data problem, so I searched again on “missing data” with much better question wheel results.

[Image: question wheel showing at least one question for every question word: how, why, can, when, will, who, what, where, are, and which]

Now, what’s really nice is that you can click on any of the results in a wheel and be taken to the Google search for that result in a new browser tab. There, you can take a look at the useful “People also ask” section.

Clicking on “why missing data is a problem” yielded the following results:

  • Why are missing values bad?
  • What are the causes of missing data?
  • Is missing data a problem in regression?
  • How does missing data cause bias?

Personally, I’m very intrigued by that last one. It ties not only into the topic I’m currently investigating for inspiration (including its origin story!), but also into the other bit of data I pulled from my notebook earlier in this post: “bias a la LinkedIn exchange with Kim.” These are conversations I have in real life, and that tie back to my experience, knowledge, and expertise. This is now the starting point for a real content idea that will tie me into a larger, public conversation on a topic that inspired me during a smaller, private conversation. Maybe it will end up being a future video or a newsletter tip.

Bringing it together, SOP-style – Being a part of the conversation

Since this is a SOP Story about content idea generation, I’ll summarize the two processes that I think relate to this topic, and which can be placed within your larger SOPs on generating the content itself.

Setting yourself up for success through conversation-related data collection:

  • Record notes about the conversations you have with clients and others regarding your niche – your experiences, knowledge, and expertise. Make these notes as extensive as they need to be to jog your memory later. Feel free to include links, pictures, and whatever else helps you capture that moment.

Looking for inspiration from your data:

  • When you’re searching for ideas, flip through those notes to see what captures your fancy, or what you at least think you can work up into a content piece now that it’s been brought to your attention again.
  • Pick an idea and rephrase it as one or two words.
  • Plug that into AnswerThePublic.
  • Scan the results for questions people might have about that topic, clicking through to the Google search results to check the “People also ask” section.
  • Take notes.
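
If you keep your collected questions in a plain list, the scanning step can be roughed out in a few lines of Python. This is only a sketch of one way to do it, not how AnswerThePublic works internally: the function names are hypothetical, the question words are the ones from the “missing data” wheel, and the sample questions are the “People also ask” results quoted earlier.

```python
from collections import defaultdict
from urllib.parse import urlencode

# Question words that appeared on the "missing data" question wheel.
QUESTION_WORDS = (
    "how", "why", "can", "when", "will",
    "who", "what", "where", "are", "which",
)


def group_by_question_word(phrases):
    """Group phrases by their first word, if it is a known question word."""
    groups = defaultdict(list)
    for phrase in phrases:
        words = phrase.strip().lower().split()
        if not words:
            continue  # skip blank entries
        key = words[0] if words[0] in QUESTION_WORDS else "other"
        groups[key].append(phrase.strip())
    return dict(groups)


def google_search_url(phrase):
    """Build a Google search URL for a phrase, to check 'People also ask'."""
    return "https://www.google.com/search?" + urlencode({"q": phrase})


questions = [
    "Why are missing values bad?",
    "What are the causes of missing data?",
    "Is missing data a problem in regression?",
    "How does missing data cause bias?",
]

groups = group_by_question_word(questions)
url = google_search_url("why missing data is a problem")
```

Anything that doesn't start with one of the wheel's question words lands in an "other" bucket, which is often where the more unusual ideas hide.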

Creating Your Own Mashup

Do you need to start using a paper notebook or AnswerThePublic? No. I think it helps to find out what other people do, and try stuff out so you can settle on the mashup of techniques that works for you. If you’re faced with the need to regularly generate content, I highly recommend trying out both a notebook (digital or paper) for a month and an AnswerThePublic search or two, to see if they aid your content idea generation workflow.

Barbara Olsafsky

Owner and Data Wrangler/Strategist
