nbclouder package

Submodules

nbclouder.clouder module

class nbclouder.clouder.Clouder(naver_id: str, NID_AUT: Optional[str] = None, NID_SES: Optional[str] = None)

Bases: object

Get posts and make word cloud.

Parameters

naver_id – Naver ID of the owner of the posts. Note that this is the blog owner's ID, not necessarily your own Naver ID.
NID_AUT – one of the Naver login cookies; you can find it in your browser's cookies tab. It is required only if you want to get private posts.
NID_SES – another Naver login cookie, similar to NID_AUT.
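
The constructor arguments above can be assembled as in the following minimal sketch. The helper names `load_naver_cookies` and `make_clouder`, and the choice of environment variables named after the cookies, are assumptions made for illustration; they are not part of nbclouder.

```python
import os
from typing import Any, Dict, Mapping, Optional


def load_naver_cookies(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Hypothetical helper: read the two login cookies from a mapping.

    Defaults to os.environ. A missing entry becomes None, which is the
    documented default for both constructor parameters.
    """
    source = os.environ if env is None else env
    return {"NID_AUT": source.get("NID_AUT"), "NID_SES": source.get("NID_SES")}


def make_clouder(naver_id: str) -> Any:
    """Hypothetical helper: build a Clouder for the given blog owner's ID."""
    # Imported lazily so this sketch can be loaded without nbclouder installed.
    from nbclouder import Clouder

    return Clouder(naver_id, **load_naver_cookies())
```

Keeping the cookies out of source code avoids committing login credentials by accident; for public posts both values can simply stay unset.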

category_names(): Return all category names.

fire(category_names: List[str], image_path: str, font_path: str, pos_tagging_fn: Optional[Callable[[str], List[Tuple[str, str]]]] = None, datetime_filter_fn: Optional[Callable[[datetime.datetime], bool]] = None, white_tags: Iterable[str] = ('Noun', 'Verb', 'Adjective'), background_color: str = 'white', width: int = 800, height: int = 600, **kwargs) → Tuple[wordcloud.wordcloud.WordCloud, Dict[str, int], List[str], List[str]]: Carry out all processes. The parameters are the same as those of the other methods.
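
Judging only from the method signatures in this reference, the processes that fire combines appear to be get_post_ids, get_contents, get_word_frequency and make_cloud, in that order. The sketch below chains them by hand on any object that exposes those four methods. The function name `run_steps` is hypothetical, and the ordering is an assumption rather than the library's confirmed implementation.

```python
from typing import Any, Dict, List, Tuple


def run_steps(
    clouder: Any,
    category_names: List[str],
    image_path: str,
    font_path: str,
) -> Tuple[Any, Dict[str, int]]:
    """Chain the four documented methods (assumed order, not confirmed)."""
    # 1. Resolve category names to post ids.
    post_ids = clouder.get_post_ids(category_names)
    # 2. Download the content of each post.
    contents = clouder.get_contents(post_ids)
    # 3. Count words, using the method's default tagger and tag whitelist.
    word_frequency = clouder.get_word_frequency(contents)
    # 4. Render the word cloud and save it to image_path.
    cloud = clouder.make_cloud(image_path, word_frequency, font_path)
    return cloud, word_frequency
```

Calling the steps separately is mainly useful when an intermediate result, such as the word frequencies, needs to be inspected or adjusted before rendering.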

get_contents(post_ids: List[str], datetime_filter_fn: Optional[Callable[[datetime.datetime], bool]] = None) → List[str]

Get content of posts.

Parameters

post_ids – a list of post IDs in the Naver blog.
datetime_filter_fn – a function that filters posts by datetime.

Returns: a collection of post contents. If without_datetime is true, only the post content is returned.
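
A datetime_filter_fn only needs to match the documented shape, a callable that takes a datetime.datetime and returns a bool. The sketch below builds such a callable for a date range. It assumes that returning True means "keep this post" and that the datetimes passed in are naive; neither is stated in this reference. The names `posted_between` and `only_2020` are illustrative.

```python
from datetime import datetime
from typing import Callable


def posted_between(start: datetime, end: datetime) -> Callable[[datetime], bool]:
    """Build a filter that accepts datetimes in the half-open range [start, end)."""

    def accept(posted_at: datetime) -> bool:
        # Assumption: True keeps the post, False drops it.
        return start <= posted_at < end

    return accept


# Example filter covering the calendar year 2020.
only_2020 = posted_between(datetime(2020, 1, 1), datetime(2021, 1, 1))
```

Such a filter would be passed as `datetime_filter_fn=only_2020` to get_contents or fire.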

get_post_ids(category_names: List[str]) → List[str]

Get a list of post IDs.

Parameters: category_names – names of the categories to get posts from.
Returns: a list of pids (post IDs) of the Naver blog.

get_word_frequency(contents: Union[List[Tuple[datetime.datetime, str]], List[str]], pos_tagging_fn: Optional[Callable[[str], List[Tuple[str, str]]]] = None, white_tags: Iterable[str] = ('Noun', 'Verb', 'Adjective'), preserve_tag: bool = False, **kwargs) → Union[Dict[str, int], Dict[Tuple[str, str], int]]

Calculate word frequency.

Parameters

contents – the output of the get_contents method.
pos_tagging_fn – a function that performs part-of-speech (POS) tagging on text.
white_tags – a collection of tag types that should be counted.
preserve_tag – whether to preserve and return tag information. If false, just return {word: count}.
**kwargs – options for the POS tagger.

Returns

Morpheme counts, as {morph: count}.
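
To make the roles of pos_tagging_fn and white_tags concrete, the sketch below pairs a toy tagger with a whitelist-based counter. Both `toy_tagger` (a fixed lookup table, not a real POS tagger) and `count_words` are hypothetical stand-ins that illustrate the documented callable shape; they are not nbclouder's implementation.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

Tagged = List[Tuple[str, str]]

# Tiny illustrative lexicon; a real tagger would analyse the text instead.
_LEXICON = {"cat": "Noun", "runs": "Verb", "the": "Determiner"}


def toy_tagger(text: str) -> Tagged:
    """Toy tagger: split on whitespace and look each token up in _LEXICON."""
    return [(token, _LEXICON.get(token, "Unknown")) for token in text.split()]


def count_words(
    contents: Iterable[str],
    pos_tagging_fn: Callable[[str], Tagged],
    white_tags: Iterable[str] = ("Noun", "Verb", "Adjective"),
) -> Dict[str, int]:
    """Count only the words whose tag appears in white_tags."""
    allowed = set(white_tags)
    counts: Counter = Counter()
    for text in contents:
        for word, tag in pos_tagging_fn(text):
            if tag in allowed:
                counts[word] += 1
    return dict(counts)
```

With the inputs `["the cat runs", "the cat"]`, the determiner "the" is dropped because its tag is not whitelisted, while "cat" and "runs" are counted.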

make_cloud(image_path: str, word_frequency: Dict[str, int], font_path: str, background_color: str = 'white', width: int = 800, height: int = 600, **kwargs) → wordcloud.wordcloud.WordCloud

Save the word cloud image file to ‘image_path’.

Parameters

image_path – path to save image.
word_frequency – a mapping whose keys are words (morphs) and whose values are counts.
font_path – font file path to draw word to image.
background_color – background color for word cloud image.
width – width of the canvas.
height – height of the canvas.

Returns

The WordCloud object made from ‘word_frequency’.
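
For readers who want to see what such a rendering step can look like with the wordcloud package directly, here is a minimal sketch. It mirrors the documented parameters and defaults, but `render_cloud` is a hypothetical function and not the source of make_cloud; how make_cloud forwards **kwargs is not shown here because this reference does not specify it.

```python
from typing import Any, Dict


def render_cloud(
    image_path: str,
    word_frequency: Dict[str, int],
    font_path: str,
    background_color: str = "white",
    width: int = 800,
    height: int = 600,
) -> Any:
    """Render word_frequency as an image, save it, and return the WordCloud."""
    # Third-party dependency, imported lazily so the sketch loads without it.
    from wordcloud import WordCloud

    cloud = WordCloud(
        font_path=font_path,
        background_color=background_color,
        width=width,
        height=height,
    )
    # Word sizes are derived from the counts rather than from raw text.
    cloud.generate_from_frequencies(word_frequency)
    cloud.to_file(image_path)
    return cloud
```

For Korean text, font_path should point to a font that contains Hangul glyphs, otherwise the words may render as empty boxes.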

Module contents

class nbclouder.Clouder(naver_id: str, NID_AUT: Optional[str] = None, NID_SES: Optional[str] = None)

Bases: object

Get posts and make word cloud.

The parameters and methods are the same as those of nbclouder.clouder.Clouder, documented above.