Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle nested divs in first_item_content #3348

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

CodeSonia
Copy link
Contributor

@CodeSonia CodeSonia commented Sep 27, 2024

What

Trello card

This PR addresses an issue where the content list was not displayed on some publication pages due to the way nested content elements were handled in the show_contents_list? method.

Context:

The show_contents_list? method uses various conditions to determine whether to display a content list. These rules were first introduced in PR #719.

Recently, a Zendesk ticket reported that the content list was missing from the DVSA Earned Recognition page page, resulting in an empty block of space on the left side of the publication.

The issue was traced to the method's ability to handle deeply nested content inside div elements.

Debugging:

During investigation, the following issues were identified on that page:

- first_item_content.length = 0 was inaccurately evaluated as 0
- first_item_has_long_content? = false
- first_item_content = "" was empty

However, this is not accurate, because the content includes multiple headings and subsections:

  • An h2 section for "Driver system providers"
  • Multiple h3 subsections like "A", "B", and "C", each containing p elements with contact details, all wrapped inside nested div tags.

The show_contents_list? method wasn't picking up this content due to the nested structure. As a result, the conditions returned false, causing the content list to be omitted.

Solution:

I updated the first_item_content method in contents_list.rb to handle nested divs and introduced other methods to extract valid elements p, ul, and ol inside a nested div element, ensuring that all content is considered.

Tests:

Several test cases have been added:

  1. Test case for long content in nested elements: It shows the content list when the first item contains long content inside nested div elements.
  2. Test case for short content in nested elements: Ensures that the content list is not shown when the first item contains short content, even if it is nested inside divs.
  3. Deep div nesting: This tests that the method correctly extracts text from deeply nested divs while ignoring allowed elements

Testing:

The page now correctly renders the content list:

  • first_item_content.length: 2817
  • first_item_has_long_content? true
  • first_item_content now captures the actual content.

Why

Without this fix, pages with only two content items and no recognised long content could fail to render the content list, leaving an empty whitespace that negatively impacts user experience.

Visual Changes

Before After
image image

⚠️ This repo is Continuously Deployed: make sure you follow the guidance ⚠️

Follow these steps if you are doing a Rails upgrade.

@govuk-ci govuk-ci temporarily deployed to government-frontend-pr-3348 September 27, 2024 14:10 Inactive
@CodeSonia CodeSonia marked this pull request as ready for review September 27, 2024 15:05
@CodeSonia CodeSonia force-pushed the show-content-list-with-nested-divs branch from e0ba271 to 07c7fd6 Compare September 27, 2024 15:18
@govuk-ci govuk-ci temporarily deployed to government-frontend-pr-3348 September 27, 2024 15:18 Inactive

until element.name == "h2"
first_item_text += element.text if element.name.in?(allowed_elements)
allowed_elements = %w[p ul ol div]
Copy link
Contributor

@unoduetre unoduetre Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

div is checked separately in line 53, so adding it here doesn't change anything and is confusing, so it can be removed from this line. So line 55 can be also changed to not exclude it.


until element.nil? || element.name == "h2"
if element.name == "div"
element.children.each do |child|
Copy link
Contributor

@unoduetre unoduetre Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first part of if only checks a single level of divs, so it won't find paragraphs inside nested divs e.g.:

<div>
  <div>
    <div>
      <p>
       text...
      </p>
    </div>
  </div>
</div>

It might solve the immediate problem with the particular document, but the problem will reappear when someone adds a document with more deeply nested divs.

If we want to find all nested paragraphs, a recursive solution would work, but is it what we want to do in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes sense - I'll change this so I look for more deeply nested divs as like you said I only considered the first level 😅

@CodeSonia CodeSonia marked this pull request as draft September 27, 2024 15:33
@CodeSonia CodeSonia force-pushed the show-content-list-with-nested-divs branch from 07c7fd6 to 72e43f1 Compare October 8, 2024 09:49
@govuk-ci govuk-ci temporarily deployed to government-frontend-pr-3348 October 8, 2024 09:50 Inactive
Updated the first_item_content method so that is processes child elements within nested <div> tags. This ensures that valid content, such as <p>, <ul> and <ol> elements inside deeply nested <div> structures is included when evaluating the content length.

The method can now determine whether there is enough content to render a content list.

This addressed an issue where it failed to display a list due to missing content within nested <div> elements.
end
end

def extract_nested_content(element)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some nice fancy code!! I think we should check that this is behaviour we actually want though first. I'm wondering if it might result in longer than expected contents lists as we nibble through every element on the page. Do you happen to have an example of a page with very nested content so we can look at the before an after?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants