
Suppose our Accumulo system stores entire documents in the value column using the following data structure to represent the Gettysburg Address and the Declaration of Independence:

| rowID | family | qual | time | value |
|---|---|---|---|---|
| gettysburg | speech | script | | Four score and seven years ago … |
| declaration | document | script | | When in the course of human events … |

Write Map and Reduce pseudocode that determines document similarity according to combinations of three important words. Important words are the words that are not in stop word lists. Stop words are words that are common to all documents that do not provide much indication of the topic of the document. Example stop words include the, and, to, but, because, an, a, …. Assume that you are given the following array of stop words to use:

private String[] stopWords = {"the", "and", "to", "but", "because", "an", "a", …};
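As a minimal sketch of how such a stop word list might be applied, the distinct important words of one document could be extracted as follows. The class name `ImportantWordsSketch` and the method `importantWords` are hypothetical, and the sketch assumes that a word is a run of ASCII letters and that case is ignored; the assignment does not specify either.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper; the names are illustrative and not part of the assignment.
final class ImportantWordsSketch {

    // Returns the distinct non-stop words of a document in alphabetical order.
    static SortedSet<String> importantWords(String text, String[] stopWords) {
        Set<String> stop = new HashSet<>(Arrays.asList(stopWords));
        // A TreeSet drops repeated words and keeps the rest sorted.
        SortedSet<String> words = new TreeSet<>();
        // Assumption: words are runs of ASCII letters, compared in lower case.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty() && !stop.contains(token)) {
                words.add(token);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // The elided entries of the original array are not filled in here.
        String[] stopWords = {"the", "and", "to", "but", "because", "an", "a"};
        // prints [government, people]
        System.out.println(importantWords("The people and the government", stopWords));
    }
}
```

Returning a sorted set of distinct words is a design choice that fits the rest of the task: repeated words collapse to one entry, and the alphabetical order is already in place when the words are later combined.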

For all combinations of three important (non-stop word) words, create an Accumulo output table that clusters documents according to the important word triples. Two sample output rows from the MapReduce algorithm applied to the table above will look like:

| rowID | family | qual | time | value |
|---|---|---|---|---|
| government:liberty:people | speech | script | | gettysburg |
| government:liberty:people | document | script | | declaration |

Make sure that the important words used in the rowID are sorted in alphabetical order so you have only one “government:liberty:people” rowID value, not “government:people:liberty”, “liberty:government:people”, “liberty:people:government”, “people:liberty:government”, “people:government:liberty”. Also, do not worry about multiple occurrences of any word. Even though the word government appears many times in the Declaration of Independence, just create one output for the “government:liberty:people” triple.

  1. Map algorithm pseudocode
  2. Reduce algorithm pseudocode
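One possible shape for the two steps is sketched below as a plain in-memory Java simulation, not as a definitive answer. It assumes nothing about the Hadoop or Accumulo APIs: `TripleClusterSketch`, `map`, `reduce`, and `run` are hypothetical names, the `|` separator used to pass family, qualifier, and source rowID through the shuffle is an assumption, and the short document texts in `main` are made-up stand-ins rather than the real speeches. In an actual job the map step would read cells from the input table and the reduce step would write rows to the output table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// In-memory stand-in for the MapReduce job; all names here are hypothetical.
final class TripleClusterSketch {
    // The elided entries of the original stop word array are not filled in.
    static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("the", "and", "to", "but", "because", "an", "a"));

    // Map: one input cell -> one {triple, "family|qual|rowID"} pair per word triple.
    static List<String[]> map(String rowId, String family, String qual, String value) {
        // TreeSet removes repeated words and keeps the rest in alphabetical order.
        TreeSet<String> distinct = new TreeSet<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                distinct.add(token);
            }
        }
        List<String> w = new ArrayList<>(distinct);
        List<String[]> emitted = new ArrayList<>();
        // i < j < k over a sorted list yields each triple once, already alphabetized.
        for (int i = 0; i < w.size(); i++) {
            for (int j = i + 1; j < w.size(); j++) {
                for (int k = j + 1; k < w.size(); k++) {
                    String triple = w.get(i) + ":" + w.get(j) + ":" + w.get(k);
                    emitted.add(new String[] {triple, family + "|" + qual + "|" + rowId});
                }
            }
        }
        return emitted;
    }

    // Reduce: for one triple, produce one output row {rowID, family, qual, value}
    // per source document, where value is the source document's rowID.
    static List<String[]> reduce(String triple, Iterable<String> docs) {
        Set<String> unique = new TreeSet<>();
        for (String d : docs) {
            unique.add(d);
        }
        List<String[]> rows = new ArrayList<>();
        for (String d : unique) {
            String[] p = d.split("\\|");
            rows.add(new String[] {triple, p[0], p[1], p[2]});
        }
        return rows;
    }

    // Driver: simulates the shuffle by grouping mapped pairs by triple.
    static List<String[]> run(List<String[]> cells) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String[] c : cells) {
            for (String[] kv : map(c[0], c[1], c[2], c[3])) {
                grouped.computeIfAbsent(kv[0], key -> new ArrayList<>()).add(kv[1]);
            }
        }
        List<String[]> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up toy texts, chosen so that each document has exactly three important words.
        List<String[]> cells = Arrays.asList(
            new String[] {"gettysburg", "speech", "script", "the people, the government and liberty"},
            new String[] {"declaration", "document", "script", "liberty to the people but government"});
        for (String[] row : run(cells)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

A document with n distinct important words produces n(n-1)(n-2)/6 triples in the map step, so the intermediate output grows quickly for full-length documents. Emitting each triple from a sorted, de-duplicated word list is what guarantees a single rowID per combination and a single output per document.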

