The curl_multi can break down nextcloud or the db connections #65

rabser · 2022-02-07T19:17:23Z

I experienced bad runs (many users empty lines) and a general DOS on nextcloud server when using this wonderful tool (480 users registered), as curl_multi is too aggressive.
I solved it by splitting the array of users in chunks simply wrapping the “curl_multi” user data attributes using array_chunk on the users list read from $SESSION.
Chunking in blocks of 50 users I managed to get all the info about them without loosing nextcloud or the backend database…

If my patch can be of any interest, i can share.

bpcurse · 2022-02-07T22:42:27Z

@rabser Great, I'm happy to hear you like it :)

Sharing the patch would be very appreciated, indeed.
This also sounds like a plausible explanation for another issue.

rabser · 2022-02-07T23:18:58Z

I'm sorry but I never understood how to use the pull requests (my fault), so I paste here the patch, as it's very simple:

In functions.php substitute starting from line 314

  // Initialize cURL multi handle for parallel requests
  $mh = curl_multi_init();

  // Iterate through userlist
  foreach($_SESSION['userlist'] as $key => $user_id) {

with this block

  $userids = array_chunk($_SESSION['userlist'],50);
  $chunks = count($userids);

  for ( $chunk=0; $chunk<$chunks; $chunk++) {
        // just to let you know what's happening...
        error_log("Working on chunk $chunk"); 

        // Initialize cURL multi handle for parallel requests
        $mh = curl_multi_init();
        // Iterate through userlist
        foreach($userids[$chunk] as $key => $user_id) {

then add a close brace after

curl_multi_close($mh);

to

curl_multi_close($mh);
}

which in original file is in line 350.

Obviously a better work should having a config item for chunking size, for a simpler tuning.
A chunk-size of 1, would be equal to not having curl_multi at all ...

HTH

bpcurse · 2022-02-09T18:20:25Z

Obviously a better work should having a config item for chunking size, for a simpler tuning.
A chunk-size of 1, would be equal to not having curl_multi at all ...

I had the same thought. A config option will be made available in the next version.

Thank you for contributing to this project 👍, no need to apologize for the pull request.
I had problems getting started with pull requests myself.

Just in case you would like to practice (as it is quite easy in Githubs Web UI):

Open the Code tab
Click on functions.php
Click the Edit button (a fork of the repository will be created automatically in your account)
Alter the code
Leave a short note about your changes and propose the added/changed code by clicking the corresponding button
That should be all AFAIK

rabser · 2022-02-09T19:22:28Z

Great! i didn’t know that i could simply edit directly a file to get a pull request (I manage several gitlab self hosted servers, so that i could edit a file was known to me).
Next time iI’ll try !

bpcurse · 2022-02-11T17:18:56Z

@rabser A new version containing your chunking proposal and a debug log (both are options in config.php) has been published in the debug branch. May I ask you to test this version as your setup is known to react to these changes?

Here is a howto:

Backup your functions.php, config.php and the l10n folder
After switching to the 'debug' branch download the above mentioned files and the folder and overwrite your existing ones
(regarding config.php you can just add the two new options to your existing file instead of overwriting it)
Set user_chunk_size and debug_log options as necessary

rabser · 2022-02-11T22:38:20Z

Hi,
tests were successful. Starting with chunks to false, I got many "!ERROR: Received empty user id" lines in the debug.log
Raising up (eg. 40, 50) the errors gone away, as it should be.

Just an hints about the debug.log: if you think to bring that feature to production, you should think that you should left to the admin (eg. in config.php) the choice on were debug.log will be saved or web dir should be writable by the webserver and this is normally not a best practice (nor accepted by providers).

HTH

bpcurse · 2022-02-12T18:27:05Z

Thanks a lot for testing! Glad to hear it works in "real life".

For production I am planning to show an error message after user data has been fetched and at least one transfer has failed, including a hint to alter the user_chunk_size config setting. It won't keep the user from working with the potentially incomplete data, if necessary.

This error message could offer a "download log file" button. Log data would be kept in a $_SESSION var until download is actively chosen.
I hope the use of the same system as user data csv file downloads with random temp path/file names and auto deletion after download should be secure enough? See build_csv_file, download_fileand delete_temp_folder functions in functions.php.

What do you think?

rabser · 2022-02-14T09:02:42Z

I'm not so used to save data into the $_SESSION, but yes, I can't see any problem in this strategy.

bpcurse added bug Something isn't working needs info Some information is missing labels Feb 7, 2022

bpcurse added 1. to develop Will be developed enhancement New feature or request, approved and removed needs info Some information is missing labels Feb 9, 2022

This was referenced Feb 9, 2022

Lots of empty rows #61

Open

wrong users list and empty export csv file #64

Open

bpcurse added 2. developing and removed 1. to develop Will be developed labels Feb 9, 2022

bpcurse self-assigned this Feb 9, 2022

bpcurse added 3. completed (being tested) Development is done. Will be merged. and removed 2. developing labels Feb 11, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The curl_multi can break down nextcloud or the db connections #65

The curl_multi can break down nextcloud or the db connections #65

rabser commented Feb 7, 2022

bpcurse commented Feb 7, 2022

rabser commented Feb 7, 2022

bpcurse commented Feb 9, 2022

rabser commented Feb 9, 2022

bpcurse commented Feb 11, 2022

rabser commented Feb 11, 2022 •

edited

Loading

bpcurse commented Feb 12, 2022

rabser commented Feb 14, 2022

The curl_multi can break down nextcloud or the db connections #65

The curl_multi can break down nextcloud or the db connections #65

Comments

rabser commented Feb 7, 2022

bpcurse commented Feb 7, 2022

rabser commented Feb 7, 2022

bpcurse commented Feb 9, 2022

rabser commented Feb 9, 2022

bpcurse commented Feb 11, 2022

rabser commented Feb 11, 2022 • edited Loading

bpcurse commented Feb 12, 2022

rabser commented Feb 14, 2022

rabser commented Feb 11, 2022 •

edited

Loading