Random failure while removing cache entry #189

Open
yotamgi opened this issue Sep 14, 2017 · 0 comments
yotamgi commented Sep 14, 2017

When running all 40 switchdev recipes in a row, we get this exception on one of the slaves. From that point on, all other tests fail with the same repository.

The exception happens on almost every full run of the switchdev recipes, but on a different recipe each time.

One piece of information that may be useful: it started happening right after I upgraded one of the slave machines in the setup to Fedora 26. The exception has never been seen on that slave, though.

The slave log around the first occurrence of the exception is:

2017-09-14 05:09:42       (localhost)        -   DEBUG: Executing: "ip l show ens8d1"
2017-09-14 05:09:42       (localhost)        -   DEBUG: Executing: "tc qdisc del dev ens8d1 root"
2017-09-14 05:09:42       (localhost)        -   DEBUG:
    Stderr:
    ----------------------------
    RTNETLINK answers: No such file or directory
    ----------------------------
2017-09-14 05:09:42       (localhost)        -   DEBUG: Executing: "tc filter show dev ens8d1"
2017-09-14 05:09:42       (localhost)        -   DEBUG: Executing: "tc qdisc show dev ens8d1"
2017-09-14 05:09:42       (localhost)        -   DEBUG:
    Stdout:
    ----------------------------
    qdisc mq 0: root
    qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :5 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :6 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :7 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :8 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :9 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :a limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :b limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :c limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :d limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :e limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :f limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :10 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :11 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :12 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :13 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :14 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :15 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :16 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :17 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :18 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :19 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :1a limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :1b limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :1c limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :1d limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :1e limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :1f limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    qdisc fq_codel 0: parent :20 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
    ----------------------------
2017-09-14 05:09:42       (localhost)        -   DEBUG: Executing: "ip l show ens8d1"
2017-09-14 05:09:42       (localhost)        -   DEBUG: Executing: "tc filter show dev ens8d1"
2017-09-14 05:09:42       (localhost)        -   DEBUG: Network namespace nsif removed.
2017-09-14 05:09:42       (localhost)        -    INFO: Restoring system configuration
2017-09-14 05:09:42       (localhost)        -   DEBUG:
    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/lnst/Slave/NetTestSlave.py", line 1240, in _process_msg
        result = method(*msg["args"])
      File "/usr/lib/python2.7/site-packages/lnst/Slave/NetTestSlave.py", line 109, in bye
        self._cache.del_old_entries()
      File "/usr/lib/python2.7/site-packages/lnst/Common/ResourceCache.py", line 145, in del_old_entries
        self.del_cache_entry(entry_hash)
      File "/usr/lib/python2.7/site-packages/lnst/Common/ResourceCache.py", line 130, in del_cache_entry
        shutil.rmtree("%s/%s" % (self._root, entry_hash))
      File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree
        onerror(os.listdir, path, sys.exc_info())
      File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree
        names = os.listdir(path)
    OSError: [Errno 2] No such file or directory: '/var/cache/lnst/076acc59a4677a40b25986fe0484b118'

2017-09-14 05:09:42       (localhost)        -    INFO: Lost controller connection.
2017-09-14 05:09:42       (localhost)        -    INFO: Waiting for connection.
2017-09-14 05:09:46       (localhost)        -    INFO: Recieved connection from 10.137.169.5
2017-09-14 05:09:46       (localhost)        -    INFO: Waiting for connection.
2017-09-14 05:11:05       (localhost)        -    INFO: Recieved connection from 10.137.169.5
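
The traceback above shows `shutil.rmtree` raising `OSError` with errno 2 (ENOENT) because the cache entry directory no longer exists when `del_cache_entry` tries to remove it. One possible way to tolerate that case is sketched below. This is only an illustration, not the actual LNST code: the function name `remove_cache_entry` and its return value are hypothetical, and only the path construction is taken from the traceback. It does not explain why the directory is missing in the first place.

```python
import errno
import shutil


def remove_cache_entry(root, entry_hash):
    """Remove a cache entry directory (hypothetical helper, not LNST's API).

    Returns True if the directory was removed, False if it was already gone.
    """
    # Same path construction as the rmtree call shown in the traceback.
    path = "%s/%s" % (root, entry_hash)
    try:
        shutil.rmtree(path)
    except OSError as exc:
        # ENOENT means the path no longer exists; treat that as already removed.
        # Any other error (permissions, I/O, ...) is re-raised unchanged.
        if exc.errno != errno.ENOENT:
            raise
        return False
    return True
```

With this sketch, calling the helper twice on the same entry removes the directory the first time and quietly returns `False` the second time instead of raising. Note that if a nested file disappears while the removal is in progress, the ENOENT is swallowed as well and part of the tree may be left behind.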
yotamgi added a commit to yotamgi/lnst that referenced this issue Sep 14, 2017
Workaround the issue of non existing cache entry when trying to delete.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
yuvalmin pushed a commit to yuvalmin/lnst that referenced this issue Oct 29, 2017
Workaround the issue of non existing cache entry when trying to delete.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
yuvalmin pushed a commit to yuvalmin/lnst that referenced this issue Oct 30, 2017
Workaround the issue of non existing cache entry when trying to delete.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
nogahf pushed a commit to nogahf/lnst that referenced this issue Apr 15, 2018
Workaround the issue of non existing cache entry when trying to delete.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>