
[FR] A richer query file #234

Open

chainsawriot opened this issue Aug 25, 2021 · 4 comments

@chainsawriot
Collaborator

Describe the solution you'd like

The current query file has only three lines, representing the query, start_tweets, and end_tweets parameters of get_all_tweets.

While incredibly useful, it could be richer. For example, in #217, parameters such as n and page_n are not retained, which can create some confusion.

A better solution would be to save all parameters of get_tweets into a file, but without hardcoding bearer_token. As the package depends on jsonlite anyway, maybe we could consider storing it as a JSON file.
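
For concreteness, a minimal sketch of what such a JSON file could contain (the file name and exact set of fields are only illustrative; bearer_token is deliberately left out):

query_meta <- list(query = "#commonwealthgames has:images",
                   start_tweets = "2020-01-01T00:00:00Z",
                   end_tweets = "2020-01-05T00:00:00Z",
                   n = 100,
                   page_n = 500)
## write_json passes auto_unbox/pretty on to toJSON
jsonlite::write_json(query_meta, "query.json", auto_unbox = TRUE, pretty = TRUE)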

Anything else?

No response

@cjbarrie
Owner

Agreed, this could be richer, and we should record the specified n. Do you think page_n is a useful parameter to record? To me, it seems like a design feature of the Twitter API that is pretty confusing. Could it create further confusion?

@chainsawriot
Collaborator Author

@cjbarrie I would like to make a correction: page_n will be in the params passed to get_tweets anyway. If we store params, page_n is automatically recorded.
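
A minimal sketch of what I mean (assuming, consistent with the reprex below, that page_n is what populates the max_results field of params):

params <- list(query = "from:Peter_Tolochko -is:retweet",
               max_results = 15)  # max_results here is page_n
jsonlite::toJSON(params, auto_unbox = TRUE)
#> {"query":"from:Peter_Tolochko -is:retweet","max_results":15}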

@cjbarrie
Owner

Oh, I see what you mean. In that case, yes: if we could pass all params (minus bearer_token, as you say), then that would make sense.
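
For example, the stored object could simply be the full argument list with bearer_token dropped before serialising (a sketch, values hypothetical):

all_inputs <- list(query = "from:Peter_Tolochko -is:retweet",
                   bearer_token = "SECRET",
                   n = 100,
                   page_n = 500)
to_store <- all_inputs[setdiff(names(all_inputs), "bearer_token")]
## to_store keeps query, n and page_n, but not bearer_token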

@chainsawriot
Collaborator Author

It could be something like this:

require(academictwitteR)
#> Loading required package: academictwitteR
params <- 
  list(query = "from:Peter_Tolochko -is:retweet", max_results = 15, 
       start_time = "2020-02-03T00:00:00Z", end_time = "2020-11-03T00:00:00Z", 
       tweet.fields = "attachments,author_id,conversation_id,created_at,entities,geo,id,in_reply_to_user_id,lang,public_metrics,possibly_sensitive,referenced_tweets,source,text,withheld", 
       user.fields = "created_at,description,entities,id,location,name,pinned_tweet_id,profile_image_url,protected,public_metrics,url,username,verified,withheld", 
       expansions = "author_id,entities.mentions.username,geo.place_id,in_reply_to_user_id,referenced_tweets.id,referenced_tweets.id.author_id", 
       place.fields = "contained_within,country,country_code,full_name,geo,id,name,place_type")

endpoint_url <- "https://api.twitter.com/2/tweets/search/all"
n <- 100
bind_tweets <- FALSE
verbose <- TRUE
data_path <- academictwitteR:::.gen_random_dir()
file <- NULL

get_tweet_input <- list(params = params, endpoint_url = endpoint_url, n = n, file = file,
                        bind_tweets = bind_tweets, verbose = verbose,
                        data_path = data_path)

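## Export all of get_tweet_input as a JSON string (bearer_token was never included in it)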
exported_input <- jsonlite::toJSON(get_tweet_input, null = "null")
exported_input
#> {"params":{"query":["from:Peter_Tolochko -is:retweet"],"max_results":[15],"start_time":["2020-02-03T00:00:00Z"],"end_time":["2020-11-03T00:00:00Z"],"tweet.fields":["attachments,author_id,conversation_id,created_at,entities,geo,id,in_reply_to_user_id,lang,public_metrics,possibly_sensitive,referenced_tweets,source,text,withheld"],"user.fields":["created_at,description,entities,id,location,name,pinned_tweet_id,profile_image_url,protected,public_metrics,url,username,verified,withheld"],"expansions":["author_id,entities.mentions.username,geo.place_id,in_reply_to_user_id,referenced_tweets.id,referenced_tweets.id.author_id"],"place.fields":["contained_within,country,country_code,full_name,geo,id,name,place_type"]},"endpoint_url":["https://api.twitter.com/2/tweets/search/all"],"n":[100],"file":null,"bind_tweets":[false],"verbose":[true],"data_path":["/tmp/RtmpGa8YoH/hudmlneaxtypzjowsvcg"]}

parsed_input <- jsonlite::fromJSON(exported_input)

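## Round trip: rebuild the API request from the parsed JSON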
academictwitteR:::make_query(url = parsed_input$endpoint_url,
                             params = parsed_input$params, bearer_token = get_bearer())
#> $meta
#> $meta$result_count
#> [1] 0


## data_path shouldn't be taken from parsed_input, although it is available there.
academictwitteR:::get_tweets(params = parsed_input$params, endpoint_url = parsed_input$endpoint_url, n = parsed_input$n, file = parsed_input$file,
                             bearer_token = get_bearer(), data_path = data_path, export_query = FALSE, bind_tweets = parsed_input$bind_tweets,
                             verbose = parsed_input$verbose)
#> Total pages queried: 1 (tweets captured this page: 0).
#> This is the last page for from:Peter_Tolochko -is:retweet : finishing collection.
#> Data stored as JSONs: use bind_tweets function to bundle into data.frame

Created on 2021-08-25 by the reprex package (v2.0.0)
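
If this approach is adopted, the export/import step could be wrapped in a pair of small helpers, e.g. (function names are hypothetical, not existing package functions):

export_query_json <- function(input, path) {
  ## `input` is assumed to contain everything except bearer_token
  jsonlite::write_json(input, path, null = "null")
}

import_query_json <- function(path) {
  jsonlite::fromJSON(path)
}

## export_query_json(get_tweet_input, file.path(data_path, "query.json"))
## parsed_input <- import_query_json(file.path(data_path, "query.json"))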
