nbclouder package

Submodules

nbclouder.clouder module

class nbclouder.clouder.Clouder(naver_id: str, NID_AUT: Optional[str] = None, NID_SES: Optional[str] = None)

Bases: object

Get posts and make word cloud.

Parameters
  • naver_id – Naver ID of the owner of the posts. Note that this is not necessarily your own Naver ID.

  • NID_AUT – one of the Naver login cookies; you can find it in your browser's cookies tab. It is required only if you want to get private posts.

  • NID_SES – one of the Naver login cookies. It is used in the same way as NID_AUT.

category_names()

Return all category names.

fire(category_names: List[str], image_path: str, font_path: str, pos_tagging_fn: Optional[Callable[[str], List[Tuple[str, str]]]] = None, datetime_filter_fn: Optional[Callable[[datetime.datetime], bool]] = None, white_tags: Iterable[str] = ('Noun', 'Verb', 'Adjective'), background_color: str = 'white', width: int = 800, height: int = 600, **kwargs) → Tuple[wordcloud.wordcloud.WordCloud, Dict[str, int], List[str], List[str]]

Carry out the whole process in one call. The parameters are the same as those of the other methods.
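As a rough illustration of how category_names() and fire() might be combined, the helper below builds a cloud from every category of a blog. It is only a sketch: build_cloud_for_all_categories is a hypothetical name, the file paths and the blog id in the usage comment are placeholders, and the meaning of the last two lists in the returned tuple is not documented here, so they are ignored.

```python
from typing import Any, Dict, Tuple


def build_cloud_for_all_categories(
    clouder: Any, image_path: str, font_path: str
) -> Tuple[Any, Dict[str, int]]:
    """Run fire() over every category reported by category_names().

    `clouder` is expected to be an nbclouder.Clouder instance; this helper
    is hypothetical and not part of the package.
    """
    names = list(clouder.category_names())
    # fire() is documented to return a 4-tuple; the two trailing lists are
    # not described in the reference, so this sketch discards them.
    cloud, word_frequency, _, _ = clouder.fire(
        category_names=names,
        image_path=image_path,
        font_path=font_path,
    )
    return cloud, word_frequency


# Possible usage (needs network access and the nbclouder package):
# from nbclouder import Clouder
# clouder = Clouder("some_blog_owner_id")  # placeholder id
# build_cloud_for_all_categories(clouder, "cloud.png", "/path/to/font.ttf")
```

Passing the Clouder instance in, rather than constructing it inside the helper, keeps the login cookies (NID_AUT, NID_SES) out of the helper.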

get_contents(post_ids: List[str], datetime_filter_fn: Optional[Callable[[datetime.datetime], bool]] = None) → List[str]

Get content of posts.

Parameters
  • post_ids – a list of post ids in a Naver blog.

  • datetime_filter_fn – a function that filters posts by datetime.

Returns

a collection of post contents. If without_datetime is true, only the post content is returned (without_datetime does not appear in the signature shown above).
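A datetime_filter_fn only has to match Callable[[datetime.datetime], bool]. The sketch below builds such a predicate for a half-open date range. make_range_filter and only_2021 are hypothetical names, and the sketch assumes that returning True means "keep this post" and that posts carry naive datetimes; neither is stated in this reference.

```python
from datetime import datetime
from typing import Callable


def make_range_filter(start: datetime, end: datetime) -> Callable[[datetime], bool]:
    """Return a predicate that is True for datetimes in [start, end)."""

    def in_range(posted_at: datetime) -> bool:
        # Assumption: True means the post is kept.
        return start <= posted_at < end

    return in_range


# A filter that accepts only posts written during 2021.
only_2021 = make_range_filter(datetime(2021, 1, 1), datetime(2022, 1, 1))
```

Such a predicate could then be passed as datetime_filter_fn=only_2021 to get_contents or fire.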

get_post_ids(category_names: List[str]) → List[str]

Get a list of post ids.

Parameters

category_names – names of the categories to get posts from.

Returns

a list of pids (post ids) of the Naver blog.

get_word_frequency(contents: Union[List[Tuple[datetime.datetime, str]], List[str]], pos_tagging_fn: Optional[Callable[[str], List[Tuple[str, str]]]] = None, white_tags: Iterable[str] = ('Noun', 'Verb', 'Adjective'), preserve_tag: bool = False, **kwargs) → Union[Dict[str, int], Dict[Tuple[str, str], int]]

Calculate word frequency.

Parameters
  • contents – the output of the get_contents method.

  • pos_tagging_fn – a function that performs POS (part-of-speech) tagging on text.

  • white_tags – a collection of tag types that should be counted.

  • preserve_tag – whether to preserve and return tag information. If false, just return {word: count}.

  • **kwargs – POS tagger options.

Returns

morph counts as {morph: count}.
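To make the roles of pos_tagging_fn, white_tags and preserve_tag concrete, here is a minimal sketch of tag-filtered counting. It is not the package's implementation: toy_tagger and count_words are hypothetical, the lexicon is invented for the example, and a real tagger would replace the whitespace split.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

# Invented lexicon for the toy tagger below.
LEXICON = {
    "cats": "Noun",
    "mice": "Noun",
    "chase": "Verb",
    "sleep": "Verb",
    "the": "Determiner",
}


def toy_tagger(text: str) -> List[Tuple[str, str]]:
    """Tag whitespace-separated tokens; matches Callable[[str], List[Tuple[str, str]]]."""
    return [(token, LEXICON.get(token.lower(), "Unknown")) for token in text.split()]


def count_words(
    contents: Iterable,
    pos_tagging_fn: Callable[[str], List[Tuple[str, str]]],
    white_tags: Iterable[str] = ("Noun", "Verb", "Adjective"),
    preserve_tag: bool = False,
) -> Dict:
    """Count only the words whose tag is listed in white_tags."""
    allowed = set(white_tags)
    counts: Counter = Counter()
    for item in contents:
        # Accept both plain strings and (datetime, text) pairs.
        text = item[1] if isinstance(item, tuple) else item
        for word, tag in pos_tagging_fn(text):
            if tag in allowed:
                counts[(word, tag) if preserve_tag else word] += 1
    return dict(counts)


posts = ["the cats chase mice", "the cats sleep"]
plain = count_words(posts, toy_tagger)
tagged = count_words(posts, toy_tagger, preserve_tag=True)
```

With preserve_tag=True the keys become (word, tag) pairs, which corresponds to the Dict[Tuple[str, str], int] branch of the return type.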

make_cloud(image_path: str, word_frequency: Dict[str, int], font_path: str, background_color: str = 'white', width: int = 800, height: int = 600, **kwargs) → wordcloud.wordcloud.WordCloud

Save the word cloud image file to ‘image_path’.

Parameters
  • image_path – path to save the image to.

  • word_frequency – a mapping whose keys are words (morphs) and whose values are counts.

  • font_path – path to the font file used to draw words on the image.

  • background_color – background color of the word cloud image.

  • width – width of the canvas.

  • height – height of the canvas.

Returns

the WordCloud object made from ‘word_frequency’.

Module contents

class nbclouder.Clouder(naver_id: str, NID_AUT: Optional[str] = None, NID_SES: Optional[str] = None)

Bases: object

Get posts and make word cloud.

Parameters
  • naver_id – Naver ID of the owner of the posts. Note that this is not necessarily your own Naver ID.

  • NID_AUT – one of the Naver login cookies; you can find it in your browser's cookies tab. It is required only if you want to get private posts.

  • NID_SES – one of the Naver login cookies. It is used in the same way as NID_AUT.

category_names()

Return all category names.

fire(category_names: List[str], image_path: str, font_path: str, pos_tagging_fn: Optional[Callable[[str], List[Tuple[str, str]]]] = None, datetime_filter_fn: Optional[Callable[[datetime.datetime], bool]] = None, white_tags: Iterable[str] = ('Noun', 'Verb', 'Adjective'), background_color: str = 'white', width: int = 800, height: int = 600, **kwargs) → Tuple[wordcloud.wordcloud.WordCloud, Dict[str, int], List[str], List[str]]

Carry out the whole process in one call. The parameters are the same as those of the other methods.

get_contents(post_ids: List[str], datetime_filter_fn: Optional[Callable[[datetime.datetime], bool]] = None) → List[str]

Get content of posts.

Parameters
  • post_ids – a list of post ids in a Naver blog.

  • datetime_filter_fn – a function that filters posts by datetime.

Returns

a collection of post contents. If without_datetime is true, only the post content is returned (without_datetime does not appear in the signature shown above).

get_post_ids(category_names: List[str]) → List[str]

Get a list of post ids.

Parameters

category_names – names of the categories to get posts from.

Returns

a list of pids (post ids) of the Naver blog.

get_word_frequency(contents: Union[List[Tuple[datetime.datetime, str]], List[str]], pos_tagging_fn: Optional[Callable[[str], List[Tuple[str, str]]]] = None, white_tags: Iterable[str] = ('Noun', 'Verb', 'Adjective'), preserve_tag: bool = False, **kwargs) → Union[Dict[str, int], Dict[Tuple[str, str], int]]

Calculate word frequency.

Parameters
  • contents – the output of the get_contents method.

  • pos_tagging_fn – a function that performs POS (part-of-speech) tagging on text.

  • white_tags – a collection of tag types that should be counted.

  • preserve_tag – whether to preserve and return tag information. If false, just return {word: count}.

  • **kwargs – POS tagger options.

Returns

morph counts as {morph: count}.

make_cloud(image_path: str, word_frequency: Dict[str, int], font_path: str, background_color: str = 'white', width: int = 800, height: int = 600, **kwargs) → wordcloud.wordcloud.WordCloud

Save the word cloud image file to ‘image_path’.

Parameters
  • image_path – path to save the image to.

  • word_frequency – a mapping whose keys are words (morphs) and whose values are counts.

  • font_path – path to the font file used to draw words on the image.

  • background_color – background color of the word cloud image.

  • width – width of the canvas.

  • height – height of the canvas.

Returns

the WordCloud object made from ‘word_frequency’.