Elegant way to deliver sitemap and robots.txt

Hello,

I am facing the challenge of optimizing my page for search engines (SEO). For that, I want to serve a robots.txt and a sitemap, but I am not sure about the best way to do it. Has anybody solved this already, or does anyone have a good idea how to implement it?

Thanks for your ideas!

Hello,

You can add a servlet mapped to “robots.txt”. However, a servlet is served under the application's context path, so it will be reachable at the following path:

http://localhost:8080/app/robots.txt

To serve it without a context path, deploy your application as ROOT in your web container.
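
To illustrate why the context path matters: crawlers request robots.txt from the root of the host, so the file has to be reachable at /robots.txt. Here is a minimal sketch of how the public path is composed. The class and method names are made up for illustration. It assumes the container simply prefixes the context path, which is the empty string for an application deployed as ROOT.

```java
public class ContextPathDemo {

    // Hypothetical helper: joins the web app's context path with a servlet mapping.
    // For an application deployed as ROOT, the context path is the empty string.
    static String publicPath(String contextPath, String servletMapping) {
        return contextPath + servletMapping;
    }

    public static void main(String[] args) {
        System.out.println(publicPath("/app", "/robots.txt")); // /app/robots.txt
        System.out.println(publicPath("", "/robots.txt"));     // /robots.txt
    }
}
```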

Hello, the solution mentioned above does not work for applications deployed as a single WAR. In the default deployment I got it working as follows:

Modify web.xml to load a custom servlet and provide a mapping:

<servlet>
    <servlet-name>bot-servlet</servlet-name>
    <servlet-class>com.mc.iauction.web.servlet.BotServlet</servlet-class>
    <init-param>
        <param-name>file</param-name>
        <param-value>robots.txt</param-value>
    </init-param>
    <load-on-startup>1</load-on-startup>
    <async-supported>true</async-supported>
</servlet>

<servlet-mapping>
    <servlet-name>bot-servlet</servlet-name>
    <url-pattern>/robots.txt</url-pattern>
</servlet-mapping>

Change the filter's bypass URL configuration to:

cuba.web.cubaHttpFilterBypassUrls=/ws/,/dispatch/,/front/,/robots,/sitemap

Because the CUBA HTTP filter appends a ‘/’ to the end of the request URI, I had to override the filter and remove that section:

@Override
public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain chain)
        throws IOException, ServletException {

    HttpServletRequest request = (HttpServletRequest) servletRequest;
    HttpServletResponse response = (HttpServletResponse) servletResponse;

    request.setCharacterEncoding(StandardCharsets.UTF_8.name());

    String requestURI = request.getRequestURI();
    /* REMOVED from the CUBA implementation:
    if (!requestURI.endsWith("/")) {
        requestURI = requestURI + "/";
    }
    */
    for (String bypassUrl : bypassUrls) {
        if (requestURI.contains(bypassUrl)) {
            log.trace("Skip URL check: '{}' contains '{}'", requestURI, bypassUrl);
            chain.doFilter(servletRequest, servletResponse);
            return;
        }
    }

    super.doFilter(request, response, chain);
}
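
To make the bypass check easier to follow, here is a small standalone sketch of the same substring test, using the values from the cuba.web.cubaHttpFilterBypassUrls line above. The class name BypassCheckDemo, the method isBypassed and the sample URIs are assumptions for illustration only. It does not depend on CUBA or the servlet API.

```java
import java.util.Arrays;
import java.util.List;

public class BypassCheckDemo {

    // Values copied from the cuba.web.cubaHttpFilterBypassUrls property
    static final List<String> BYPASS_URLS =
            Arrays.asList("/ws/", "/dispatch/", "/front/", "/robots", "/sitemap");

    // Same substring test as the loop in doFilter
    static boolean isBypassed(String requestUri) {
        for (String bypassUrl : BYPASS_URLS) {
            if (requestUri.contains(bypassUrl)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isBypassed("/app/robots.txt"));   // true
        System.out.println(isBypassed("/app/sitemap.xml"));  // true
        System.out.println(isBypassed("/app/login"));        // false
        System.out.println(isBypassed("/app/robots-admin")); // true, substring match
    }
}
```

Note that contains() matches anywhere in the URI, so a path such as /app/robots-admin would be bypassed as well. If that is not wanted, a stricter comparison (for example endsWith) might be a better fit.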

Then my servlet delivers the response:

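import java.io.IOException;
import java.io.InputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class BotServlet extends HttpServlet {

    private final Log log = LogFactory.getLog(getClass());
    private String file;

    @Override
    public void init() throws ServletException {
        // Read the init parameter once at startup instead of on every request
        file = getInitParameter("file");
        if (file == null) {
            throw new ServletException("Bot-Servlet: init parameter 'file' is missing");
        }
        log.info("Bot-Servlet loaded for " + file);
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        log.info("Request: " + request.getRequestURI());

        // The name is resolved relative to this class's package unless it starts with '/'
        try (InputStream inputStream = getClass().getResourceAsStream(file)) {
            if (inputStream == null) {
                log.error("Bot-Servlet: File not found! " + file);
                response.sendError(HttpServletResponse.SC_NOT_FOUND);
                return;
            }
            byte[] content = inputStream.readAllBytes(); // requires Java 9+

            // Set the headers before writing the body
            response.setContentType("text/plain;charset=UTF-8");
            response.setStatus(HttpServletResponse.SC_OK);
            response.getOutputStream().write(content);
        } catch (IOException ex) {
            log.error("Bot-Servlet: could not deliver " + file, ex);
            // An error status can only be sent while the response is uncommitted
            if (!response.isCommitted()) {
                response.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
            }
        }
    }
}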

But in a single-WAR deployment, this configuration (the filter and my BotServlet) is not loaded! Maybe you can give me a hint on how to get my servlet and configuration loaded.

Solution for single-WAR deployment: follow the instructions at https://doc.cuba-platform.com/manual-7.2/main_servlets_registration.html and register the servlet and the filter in code.