(LINUX) LINUX (2021)

How to move a static HTML site from Windows to Linux

Since its creation this site was hosted on Windows, because Windows was completely free for hosting (Microsoft sent a Windows key to anybody with an MSDN subscription). Now Microsoft has changed its policies: it decided to abandon servers, move ASP.NET to the Linux platform, and stop sending free keys through MSDN. Therefore I decided to follow these rules and leave Windows on the server too.

Over 20+ years my site has become huge: more than 120,000 pages and images.



So, this is the list of steps I performed to do this task.

1. Increase disk space.

My site now needs about 100 GB of disk space, so the first step was to add additional space.


2. Transform database from MsSQL to MySQL.

To be honest, this is not a purely static HTML site; it has a lot of active ASP.NET extensions, so my first step was to convert the database from MS SQL to MySQL.



The simplest and fastest way is to do this operation manually.

First I created the DB inside MySQL. This requires only two commands: one to create the database and one to create the user (and grant permissions).



More detailed instructions are here: Setup MariaDB on Ubuntu server (remote access, user privileges, upload database).


Then I created the DB structure I needed: I downloaded the MS SQL DB structure and converted the SQL with http://www.sqlines.com/online.



Then I copied the data to Notepad++, built the needed SQL INSERT commands, and executed them in MySQL.
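The manual Notepad++ step can also be scripted. The following is only a minimal sketch, not what I actually ran: it assumes the rows were copied as tab-separated text, and the column name `Txt` is hypothetical (only the table `Forum` and its column `i` appear in this article).

```python
def sql_literal(value):
    """Quote one field as a MySQL string literal (the text NULL stays NULL)."""
    if value == "NULL":
        return "NULL"
    # Escape backslashes first, then double the single quotes.
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def rows_to_insert(table, columns, tsv_text):
    """Build one multi-row INSERT statement from tab-separated lines."""
    rows = []
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip empty lines left over from copy and paste
        fields = line.split("\t")
        if len(fields) != len(columns):
            raise ValueError("unexpected field count in line: %r" % line)
        rows.append("(" + ", ".join(sql_literal(f) for f in fields) + ")")
    column_list = ", ".join("`%s`" % c for c in columns)
    return "INSERT INTO `%s` (%s) VALUES\n%s;" % (table, column_list, ",\n".join(rows))


if __name__ == "__main__":
    # Hypothetical sample data: two rows copied from the old database.
    print(rows_to_insert("Forum", ["i", "Txt"], "1\tHello\n2\tIt's a test\n"))
```

The generated statement can then be pasted into the MySQL client in the same way as a hand-made one.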



The main point is to turn off AUTO_INCREMENT while the data is being added. Afterwards it is set on the column again, for example for the `Forum` table:

SET FOREIGN_KEY_CHECKS=0;
alter table `Forum` modify column `i` int(11) not null AUTO_INCREMENT;
SET FOREIGN_KEY_CHECKS=1;

The next step is converting stored procedures. Unfortunately, this is possible only manually, with a lot of attention to each procedure.



3. Add Mime types.

I analyzed my additions to mime.types and added them to /etc/nginx/mime.types.
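One possible way to find which additions are needed is to compare the file extensions used on the site with the extensions already listed in mime.types. This is a rough sketch under that assumption; the function names are made up for the illustration, and the parser only understands the plain `type ext1 ext2;` entries of that file.

```python
import os
import re


def parse_mime_types(text):
    """Return the set of extensions declared in an nginx mime.types file."""
    text = re.sub(r"#.*", "", text)  # drop comments
    known = set()
    # Each entry looks like:  text/html  html htm shtml;
    for _mime, extensions in re.findall(r"([\w.+/-]+)\s+([^;{}]+);", text):
        known.update(ext.lower() for ext in extensions.split())
    return known


def missing_extensions(file_names, known):
    """Return the sorted extensions of file_names that have no MIME entry."""
    missing = set()
    for name in file_names:
        ext = os.path.splitext(name)[1].lstrip(".").lower()
        if ext and ext not in known:
            missing.add(ext)
    return sorted(missing)


if __name__ == "__main__":
    with open("/etc/nginx/mime.types") as f:
        known = parse_mime_types(f.read())
    # Hypothetical file names; in practice they would come from walking the site root.
    print(missing_extensions(["Index.htm", "Doc/Book.djvu", "Code/Parse.vb"], known))
```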

4. Add basic authentication.

My site is absolutely free, but it has a couple of folders with books that are interesting only to me. These books are not a big secret; they are ordinary computer books that you can download from thousands of sites on the Internet, but I decided to protect them with a password on my site.



So, this step adds basic authentication to some folders. This is a simple operation; in my case:


# sudo apt-get install apache2-utils
# sudo htpasswd -c /var/www/vb-net.com/html/AndroidBook/Doc/.htpasswd 1
# cat /var/www/vb-net.com/html/AndroidBook/Doc/.htpasswd

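Then I added the restriction to the NGINX rules; in my case:

    location /AndroidBook/Doc/ {
        try_files $uri $uri/ =404;
        auth_basic "Restricted Content";
        # auth_basic_user_file takes a filesystem path, not a URL path,
        # so it points to the file created by htpasswd above
        auth_basic_user_file /var/www/vb-net.com/html/AndroidBook/Doc/.htpasswd;
    }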

And restart NGINX.


# sudo service nginx restart


So this step is done too (this is my resulting config), and we move on to the main step.

5. Change file names.

Of course, Windows file names are case-insensitive, and a Windows site cannot work on Linux without tricks. Various recipes, for example "Using Apache htaccess file to change URL to lowercase" or "Convert and redirect URL in uppercase to lowercase using .htaccess", do not work in this case, because such a redirect assumes that you know the file names on disk. For example, a folder on disk is named LowCostAspNet, but inside pages I link to it as LOWCOSTASPNET or lowcostaspnet. Each letter can be written in upper or lower case, so finding the existing file on disk would need 4 294 967 296 redirects for 32 letters in the URL and 18 446 744 073 709 551 616 redirects for 64 letters in the name. So we need a workable solution, which I describe below.


Continue reading - Linux console app (.NET Core, EF DB first, CamelCase file and dir rename, calc MD5, RegExpression, change and check link).
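To illustrate the arithmetic above and one possible alternative to redirects, here is a small sketch. It is not the console app from the linked article: the function names are invented for the example, and it assumes that no two files on the site differ only by letter case (which holds for a site that came from Windows). The idea is to read the real names from disk once and look every link up by its lowercase form.

```python
import os


def case_variants(letter_count):
    """Number of upper/lower case spellings of a name with that many letters."""
    return 2 ** letter_count


def build_case_index(root):
    """Map the lowercase relative path of every file and folder to its real spelling."""
    index = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            real = os.path.relpath(os.path.join(dirpath, name), root)
            real = real.replace(os.sep, "/")
            index[real.lower()] = real
    return index


def resolve(index, link):
    """Return the on-disk spelling for a site-relative link, or None if it is unknown."""
    return index.get(link.strip("/").lower())


if __name__ == "__main__":
    print(case_variants(len("LowCostAspNet")))
    index = build_case_index("/var/www/vb-net.com/html")
    print(resolve(index, "/LOWCOSTASPNET/index.htm"))
```

With such an index, one pass over the pages can rewrite every link to the spelling that exists on disk, so no redirect rules are needed at all.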

6. Change CMS.

Of course, my site has my own CMS to publish pages, sync the local Windows folder with the remote one, automatically create the list of articles //www.vb-net.com/Articles/index.htm, automatically create a forum topic for user comments on each page, automatically create RSS feeds, create advertising (at the top of pages you can see a ticker with related topics), and many other features. Of course, I don't add a page to each of these lists manually; my CMS does all the needed operations automatically. So my CMS needed to change too.


7. Recursion and memory optimization.


The most unexpected and interesting step was memory optimization. I started this program on a huge Linux machine with 50 GB of memory, so I didn't think about memory at all; I thought only about my own time that I must spend on programming.



But I found additional time for optimization, and in a couple of minutes I reduced memory consumption from 1300 MB (with 200-300 MB of objects) to 75 MB (with 0.1 MB of objects), about 20 times less!



All I needed to do to reduce memory so radically was to remove recursion. I replaced this code:


Imports System.Text.RegularExpressions

Partial Module Program

    Public Enum LinkType
        Href = 1
        Src = 2

    End Enum

    Sub ParseOneFile(FileName As String, ByRef HTML As String)
        Dim HrefRegex = New Regex("<a\s.*?href=(?:'|"")([^'"">]+)(?:'|"")", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
        RecursiveProcessingOneInternalLink(FileName, HTML, HrefRegex, LinkType.Href, 0)
        Dim LocationRegex = New Regex("location.href=(?:'|"")([^'"">]+)(?:'|"")", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
        RecursiveProcessingOneInternalLink(FileName, HTML, LocationRegex, LinkType.Href, 0)
        Dim SrcRegex = New Regex("<img\s.*?src=(?:'|"")([^'"">]+)(?:'|"")", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
        RecursiveProcessingOneInternalLink(FileName, HTML, SrcRegex, LinkType.Src, 0)
        Dim LinkRegex = New Regex("<link\s.*?href=(?:'|"")([^'"">]+)(?:'|"")", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
        RecursiveProcessingOneInternalLink(FileName, HTML, LinkRegex, LinkType.Href, 0)
        Dim ScriptRegex = New Regex("<script\s.*?src=(?:'|"")([^'"">]+)(?:'|"")", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
        RecursiveProcessingOneInternalLink(FileName, HTML, ScriptRegex, LinkType.Src, 0)
    End Sub

    Sub RecursiveProcessingOneInternalLink(FileName As String, ByRef HTML As String, Regex As Regex, Type As LinkType, StartIndex As Integer)
        Dim Links As MatchCollection = Regex.Matches(HTML)
        If StartIndex <= Links.Count - 1 Then
            If Links(StartIndex).Value.ToLower.Contains("vb-net.com") And Not Links(StartIndex).Value.ToLower.Contains("forum.vb-net.com") And Not Links(StartIndex).Value.ToLower.Contains("products.vb-net.com") And Not Links(StartIndex).Value.ToLower.Contains("bug.vb-net.com") And Not Links(StartIndex).Value.ToLower.Contains("freeware.vb-net.com") Or
                Not Links(StartIndex).Value.ToLower.Contains("http") And Not Links(StartIndex).Value.ToLower.Contains("href=""#""") And Not Links(StartIndex).Value.ToLower.Contains("vb-net") And Not Links(StartIndex).Value.ToLower.Contains("forum.vb-net.com") Then
                'processing internal link
                'Debug.Print($"{StartIndex}: {Links(StartIndex).Index}:{Links(StartIndex).Value}")
                ReplaceOneLink(FileName, HTML, Links(StartIndex).Index, Links(StartIndex).Value, Type)
                RecursiveProcessingOneInternalLink(FileName, HTML, Regex, Type, StartIndex + 1)
            Else
                'look to next link
                RecursiveProcessingOneInternalLink(FileName, HTML, Regex, Type, StartIndex + 1)
            End If
        End If
    End Sub

    Sub ReplaceOneLink(FileName As String, ByRef HTML As String, LinkPosition As Integer, LinkText As String, Type As LinkType)
        Dim Str1 As New Text.StringBuilder()
        ' ...
        Str1.Append(Mid(HTML, LinkPosition + Len(LinkText)))    'add right HTML part outside of link
        HTML = Str1.ToString
    End Sub
    ' ...

with this code: https://github.com/Alex-1347/WindowsServiceExample/blob/main/CacheBuilder/Parse.vb


Imports System.Text.RegularExpressions

Public Module Proxy

    Public Enum LinkType
        Href = 1
        Src = 2
    End Enum

    'run every link pattern over the page, one after another
    Sub ProcessingHTML(ByRef HTML As String)
        Dim HrefRegex As Regex = New Regex("<a\s.*?href=(?:'|"")([^'"">]+)(?:'|"")", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
        ProcessingLinks(HTML, HrefRegex, LinkType.Href)
        HrefRegex = Nothing
        Dim LocationRegex As Regex = New Regex("location.href=(?:'|"")([^'"">]+)(?:'|"")", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
        ProcessingLinks(HTML, LocationRegex, LinkType.Href)
        LocationRegex = Nothing
        Dim SrcRegex As Regex = New Regex("<img\s.*?src=(?:'|"")([^'"">]+)(?:'|"")", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
        ProcessingLinks(HTML, SrcRegex, LinkType.Src)
        SrcRegex = Nothing
        Dim LinkRegex As Regex = New Regex("<link\s.*?href=(?:'|"")([^'"">]+)(?:'|"")", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
        ProcessingLinks(HTML, LinkRegex, LinkType.Href)
        LinkRegex = Nothing
        Dim ScriptRegex As Regex = New Regex("<script\s.*?src=(?:'|"")([^'"">]+)(?:'|"")", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
        ProcessingLinks(HTML, ScriptRegex, LinkType.Src)
        ScriptRegex = Nothing
    End Sub

    'plain loop over the matches instead of one recursive call per link
    Sub ProcessingLinks(ByRef HTML As String, Regex As Regex, Type As LinkType)
        Dim Links As MatchCollection = Regex.Matches(HTML)
        Dim I As Integer = 0
        While Links.Count > 0
            If Not Links(I).Value.ToLower.Contains("//") Then
                ReplaceOneRelativeLink(HTML, Links(I).Index, Links(I).Value, Type)
                Links = Regex.Matches(HTML)     'HTML has changed, so match positions must be refreshed
            End If
            If I < Links.Count - 1 Then
                I += 1
            Else
                Exit While
            End If

        End While
        Links = Nothing
    End Sub

    Sub ReplaceOneRelativeLink(ByRef HTML As String, LinkPosition As Integer, LinkText As String, Type As LinkType)
        Dim Str1 As New Text.StringBuilder()
        Str1.Append(Left(HTML, LinkPosition))               'add left HTML part outside of link
        Dim Pos1 As Integer
        Select Case Type
            Case LinkType.Href
                Pos1 = InStr(LinkText.ToLower, "href=", CompareMethod.Text)
            Case LinkType.Src
                Pos1 = InStr(LinkText.ToLower, "src=", CompareMethod.Text)
        End Select
        If Pos1 > 0 Then
            'Pos2 and Pos3 are the opening and closing quotes around the URL
            Dim Pos2 = InStr(Pos1 + 1, LinkText.ToLower, """", CompareMethod.Text)
            If Pos2 <= 0 Then
                Pos2 = InStr(Pos1 + 1, LinkText.ToLower, "'", CompareMethod.Text)
            End If
            If Pos2 <= 0 Then
                Debug.Print("Link start not found :" & LinkText)
            Else
                Dim Pos3 As Integer = InStr(Pos2 + 1, LinkText.ToLower, """", CompareMethod.Text)
                If Pos3 <= 0 Then
                    Pos3 = InStr(Pos2 + 1, LinkText.ToLower, "'", CompareMethod.Text)
                End If
                If Pos3 <= 0 Then
                    Debug.Print("Link end not found :" & LinkText)
                Else
                    Dim ClearSiteLink As String = Mid(LinkText, Pos2 + 1, Pos3 - Pos2 - 1)
                    Str1.Append(Left(LinkText, Pos2))           'add left part of link
                    If ClearSiteLink.StartsWith("/") Then
                        Str1.Append(PromocodeCacheCreater.TargetServerRoot)
                        Str1.Append(ClearSiteLink)
                    Else ' link starts with other chars - #, Index.htm, ../
                        Str1.Append(PromocodeCacheCreater.TargetServerPath)
                        Str1.Append("/")
                        Str1.Append(ClearSiteLink)
                    End If
                    If PromocodeCacheCreater.IsUrlCollected Then
                        PromocodeCacheCreater.UrlList.Add(ClearSiteLink)
                    End If
                    Str1.Append(Mid(LinkText, Pos3 + 1))         'add right part of link
                End If
            End If
        Else
            Debug.Print("Link not found : " & LinkText)
        End If
        Str1.Append(Mid(HTML, LinkPosition + Len(LinkText)))    'add right HTML part outside of link
        HTML = Str1.ToString
        Str1 = Nothing
    End Sub

End Module
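The difference between the two listings can be reduced to a few lines. The sketch below is in Python and is not a translation of my VB.NET code: the regular expression is simplified to a bare `href` attribute, and `fix` stands for any hypothetical function that rewrites one URL. The recursive variant keeps one stack frame, one copy of the page and one fresh list of matches alive per link; the loop variant walks over the matches once.

```python
import re

# Simplified pattern: href='...' or href="...", with the URL in group 2.
HREF = re.compile(r"""href=(['"])([^'">]+)\1""", re.IGNORECASE)


def rewrite_recursive(html, fix, start=0):
    """One call per link; every call re-runs the regex over the whole page."""
    links = list(HREF.finditer(html))
    if start >= len(links):
        return html
    m = links[start]
    html = html[:m.start(2)] + fix(m.group(2)) + html[m.end(2):]
    return rewrite_recursive(html, fix, start + 1)


def _swap_url(match, fix):
    """Rebuild the matched attribute with only the URL part replaced."""
    whole = match.group(0)
    start = match.start(2) - match.start(0)
    end = match.end(2) - match.start(0)
    return whole[:start] + fix(match.group(2)) + whole[end:]


def rewrite_iterative(html, fix):
    """Single pass over the matches, no growth of the call stack."""
    return HREF.sub(lambda m: _swap_url(m, fix), html)


if __name__ == "__main__":
    page = "<a href=\"/a\">x</a><A HREF='b.htm'>y</A>"
    print(rewrite_iterative(page, lambda url: "/root/" + url.lstrip("/")))
```

Both functions return the same text; only the amount of intermediate data that stays alive during processing differs, which is probably the effect I saw in the memory profiler.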

8. Enable SSI

9. Upload site to Cloudflare CDN

10. Get free SSL certificate from Cloudflare

11. Enable SSL

12. My final NGINX config

events {
    worker_connections  4096;  ## Default: 1024
    use epoll;
}

http {

    include    mime.types;
    default_type  application/octet-stream;

    server {

        sendfile        on;
        keepalive_timeout  65;

        listen 80;
        listen [::]:80;

        # *** Other domain ***

        location / {
            try_files $uri $uri/ =404;
        }

        location ~ \.php$ {
            try_files $uri =404;
        }
    }

    server {

        sendfile        on;
        keepalive_timeout  65;

        listen 80;
        listen [::]:80;

        # "listen 443 ssl" replaces the obsolete "ssl on;" directive
        listen 443 ssl;

        ssl_certificate     /etc/ssl/vb-net-bundle.txt;
        ssl_certificate_key /etc/ssl/PrivatePemCert.txt;

        root /var/www/vb-net.com/forum;
        index Index.htm;

        server_name forum.vb-net.com;

        location / {
            # try_files '' /Index.htm =404;
            try_files $uri =404;
        }

        # old ASP.NET URLs are answered with the static index page
        location /Forum.aspx {
            try_files '' /Index.htm =404;
        }

        location /reclama.html {
            try_files '' /Index.htm =404;
        }

        location /Reclama.aspx {
            try_files '' /Index.htm =404;
        }

        location /reclama.aspx {
            try_files '' /Index.htm =404;
        }

        location /rss.ashx {
            try_files '' /Index.htm =404;
        }
    }

    server {

        sendfile        on;
        keepalive_timeout  65;

        listen 80;
        listen [::]:80;

        # "listen 443 ssl" replaces the obsolete "ssl on;" directive
        listen 443 ssl;

        ssl_certificate     /etc/ssl/vb-net-bundle.txt;
        ssl_certificate_key /etc/ssl/PrivatePemCert.txt;

        root /var/www/vb-net.com/html;
        index Index.htm index.htm index.html Index.html;

        server_name vb-net.com www.vb-net.com;

        location / {
            ssi on;
            try_files $uri $uri/ $uri/Index.htm =404;
        }

        # auth_basic_user_file takes filesystem paths, not URL paths;
        # the files are assumed to live under the root above
        location /2015/Doc/ {
            ssi on;
            try_files $uri $uri/ =404;
            auth_basic "Restricted Content";
            auth_basic_user_file /var/www/vb-net.com/html/2015/Doc/.htpasswd;
        }

        location /AndroidBook/Doc/ {
            ssi on;
            try_files $uri $uri/ =404;
            auth_basic "Restricted Content";
            auth_basic_user_file /var/www/vb-net.com/html/AndroidBook/Doc/.htpasswd;
        }

        location /ProgramTheory/Books/ {
            ssi on;
            try_files $uri $uri/ =404;
            auth_basic "Restricted Content";
            auth_basic_user_file /var/www/vb-net.com/html/ProgramTheory/Books/.htpasswd;
        }

        # never serve dot-files such as .htpasswd
        location ~ /\. {
            deny all;
            access_log off;
            log_not_found off;
        }

        location ~ \.php$ {
            try_files $uri =404;
        }

        # lowercase index.htm is redirected to the real file name
        location ~ (.*)/index.htm$ {
            return 301 $1/Index.htm;
        }
    }

    server {
        listen       90;
        server_name  localhost;

        location /CS {
            #root   /var/www/development/API
            #Microservices
            proxy_pass         http://localhost:5000;
            proxy_http_version 1.1;
            proxy_set_header   Upgrade $http_upgrade;
            proxy_set_header   Connection keep-alive;
            proxy_set_header   Host $host;
            proxy_cache_bypass $http_upgrade;
            proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header   X-Forwarded-Proto $scheme;
        }

        location / {
            #root   /var/www/development/Blazor
            #Frontend
            proxy_pass         http://localhost:6000;
            proxy_http_version 1.1;
            proxy_set_header   Upgrade $http_upgrade;
            proxy_set_header   Connection keep-alive;
            proxy_set_header   Host $host;
            proxy_cache_bypass $http_upgrade;
            proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header   X-Forwarded-Proto $scheme;
        }
    }
}

