Gopher Extension
================

# Goals of this document

The intention is to not make radical changes to the RFC1436 standard.

This document also describes the commonly used extensions to the
Gopher RFC and some clarifications to the wording of the RFC.

Since the publication of the RFC1436 standard around March 1993 there
have been developments, such as the adoption of the UTF-8
text-encoding and the use of SSL and later TLS encryption.

The recommendations can therefore be seen as guidelines or
"SHOULD".


# Added Types

Types can be added, this doesn't violate the RFC specification:
section 3.8: "Characters '0' through 'Z' are reserved.".

These are types that are commonly used.

* The 'h' type: HTML file, with the "URL:" prefix in the selector it points to
  a URL, see historical mail conversation (embedded and referenced below).
* The 'i' type, Informational message: display as text.
      i Some message TAB empty selector TAB server TAB port CR LF
  The server and port should be included for compatibility.
* As mentioned in the original Gopher RFC, for other types:
  Anything that is primarily a text file? Use the 0 type.
  Anything unknown or a binary file? Use the 9 type and a file extension.
* Use the image (I) type for png, jpg etc. Make sure to set the file extension.


# Using the proper type for Text file or binary

* Sometimes the question comes up whether PDF or XML is binary. If the
  file is readable as text it is a text file, otherwise it is binary.

  For example PDF would be using the binary 9 type and a .pdf file
  extension. XML would be a text 0 type.
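
The rule of thumb above (readable as text means type 0, otherwise type 9, images get type I) can be sketched in Python. This is only an illustrative heuristic, not something the RFC or this document prescribes: the function name `guess_item_type`, the NUL-byte check, the UTF-8 decoding test and the extension list are all assumptions made for the example.

```python
# Hypothetical helper: pick a Gopher item type for a file.
# Assumptions (not from the RFC): "readable as text" is approximated as
# "contains no NUL byte and decodes as UTF-8"; images are detected by
# file extension only.

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def guess_item_type(filename: str, data: bytes) -> str:
    """Return 'I' for images, '0' for text, '9' for anything else."""
    if filename.lower().endswith(IMAGE_EXTENSIONS):
        return "I"
    if b"\x00" in data:
        # NUL bytes practically never occur in text files.
        return "9"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "9"
    return "0"


if __name__ == "__main__":
    print(guess_item_type("feed.xml", b"<?xml version='1.0'?><feed/>"))
    print(guess_item_type("paper.pdf", b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"))
    print(guess_item_type("logo.PNG", b"\x89PNG\r\n\x1a\n"))
```

A server author would still want to keep the file extension in the selector, as recommended above, since clients often rely on it to choose a viewer.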

  Type 0 is for files that are pure text and can be displayed in a text
  editor.


# Text Encoding

* The Notes section in the Gopher RFC mentions Latin1 encoding.

  Recommendation: Use UTF-8 or ASCII-only for the Gopher
  username/title field. A client may want to display the other fields,
  so be polite and use UTF-8 or ASCII there as well if possible.

  Reason: UTF-8 is a simple text-encoding and commonly used these days.

  People who use Latin1 eat children.


# Accessibility

* Printable characters and line width, from the Gopher RFC standard:

      It is *highly* recommended that the User_Name field contain only
      printable characters, since many different clients will be using
      it. However if eight bit characters are used, the characters
      should conform with the ISO Latin1 Character Set. The length of
      the user-displayable line should be less than 70 characters; longer
      lines may not fit across some screens.

  New recommendations:
  * Don't use more than 79 columns of UTF-8 encoded displayed "username" text.
  * Try to reduce the amount of ASCII art which can contain non-printable
    characters. Think of the blind or tools used to parse actual textual content.

  Reason: A clarification of the term "characters" is needed.


* "The selector string should be no longer than 255 characters."

  Recommendation: use no longer than 255 bytes.

  Reasons for this are:
  * A clarification of the term "characters" is needed. Characters could
    nowadays be interpreted as Unicode characters or column size of Unicode
    characters instead of bytes.
  * Clients can simply use a static buffer to fit 255 bytes.
  * Although Gopher does not have to map to a filesystem, filesystems typically
    have a limit of around 255 bytes also.
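
To illustrate why "255 characters" and "255 bytes" differ, here is a minimal Python sketch of a selector length check. The helper name `selector_fits` and the choice to measure the UTF-8 encoded form are assumptions made for this example.

```python
# Hypothetical check for the 255-byte selector recommendation.
# Assumption: the selector is sent on the wire as UTF-8.

MAX_SELECTOR_BYTES = 255


def selector_fits(selector: str, limit: int = MAX_SELECTOR_BYTES) -> bool:
    """Return True if the UTF-8 encoded selector is at most `limit` bytes."""
    return len(selector.encode("utf-8")) <= limit


if __name__ == "__main__":
    ascii_selector = "/" + "a" * 254      # 255 characters, 255 bytes
    umlaut_selector = "\u00e4" * 200      # 200 characters, 400 bytes
    print(selector_fits(ascii_selector))
    print(selector_fits(umlaut_selector))
```

The second selector is well under 255 characters but does not fit a 255-byte buffer, which is the ambiguity the recommendation removes.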


* From section 3.5:

      If a client does not understand what a, say, type 'B' item (not a core
      item) is, then it may simply ignore the item in the directory
      listing; the user never even has to see it. Alternatively, the item
      could be displayed as an unknown type.

  Recommendation: For clients, do not silently ignore an item, but display it
  as an unknown type.
  Reason: Define a recommendation for consistent behaviour in clients.


# Server and client handling of text file types

The RFC defines:

    Textfile Entity

    TextFile ::= {TextBlock} Lastline

and:

    Note: Lines beginning with periods must be prepended with an extra
    period to ensure that the transmission is not terminated early.
    The client should strip extra periods at the beginning of the line.

and:

    Note: The client should be prepared for the server closing the
    connection without sending the Lastline. This allows the
    client to use fingerd servers.

From section 4:

    (b) The well-tempered server ought to send "text" (unless a file
    must be transferred as raw binary). Should this text include
    tabs, formfeeds, frufru? Probably not, but rude servers will
    probably send them anyway. Publishers of documents should be
    given simple tools (filters) that will alert them if there are any
    funny characters in the documents they wish to publish, and give
    them the opportunity to strip the questionable characters out; the
    publisher may well refuse.

    (c) The well-tempered client should do something reasonable with
    funny characters received in text; filter them out, leave them in,
    whatever.

We think the above description is too vague and can be simplified.

Recommendation: handle retrieving text file types the same as binary types.
For clients, the Lastline pattern (".\r\n") is not handled specially in this case;
it is part of the data.
For servers no preprocessing is done on the TextFile data.

Reason: Simplify the implementation of handling text types. Make the behaviour
of text output consistent for clients.


# The 'h' type: extract from the file references/h_type.txt

Below is an archived conversation about the Gopher 'h' type:

    Received: with LISTAR (v1.0.0; list gopher); Tue, 12 Feb 2002 14:19:47 -0500 (EST)
    Return-Path: <jgoerzen@complete.org>
    Delivered-To: gopher@complete.org
    To: gopher@complete.org
    Subject: [gopher] Links to URL
    From: John Goerzen <jgoerzen@complete.org>
    Date: 12 Feb 2002 14:19:46 -0500
    Content-type: text/plain; charset=us-ascii
    Content-Transfer-Encoding: 8bit

    I think it is best to start small with modifications to the protocol.
    Therefore, I propose the following:

    Method to link to URLs from Gopherspace
    ---------------------------------------

    1. Protocol issues

    Links to URLs from a gopher directory shall be defined as follows:

    Type -- the appropriate character corresponding to the type of the
    document on the remote end; h if HTML.

    Path -- the full URL, preceeded by "URL:". For instance:
    URL:http://www.complete.org/

    Host, Port -- pointing back to the gopher server that provided
    the directory for compatibility reasons.

    Name -- as usual for a Gopher directory entry.

    2. Conforming client requirements

    A client adhering to this specification will, when it sees a Gopher
    selector with a path starting with URL:, interpret the path as a URL.
    It will ignore the host and port components of the Gopher selector,
    using those components from the URL instead (if applicable).

    3. Conforming server requirements

    A server with Gopher URL support will not, in most cases, need to take
    extra steps to provide this support beyond those outlined in
    Compatibility below.
    Servers not implementing those steps outlined in
    Compatibility will be deemed to be not in compliance.

    4. Authoring compliance

    The use of URL: selectors should be avoided wherever possible. In
    particular, it should be avoided when pre-existing gopher facilities
    exist for the type of content linked. The following URL types are
    explicitly prohibited by this specification:

    gopher
    telnet
    tn3270

    Authors should avoid links to any document not of HTML type whenever
    possible. Linking to non-HTML documents will break compatibility with
    Gopher browsers that do not implement this specification. The ranks
    of these browsers include most Web browsers, so that is a significant
    audience.

    5. Compatibility

    Links to HTML pages may be accomodated even for non-comforming
    browsers by providing additional capabilities in the server.

    When a non-conforming browser is instructed to follow a link to a URL,
    it will contact the Gopher server that provided the menu (since these
    are specified per section 1).

    When a conforming Gopher server receives a request whose path begins
    with URL:, it will write out a HTML document that will send the
    non-compliant browser to the appropriate place. One such conforming
    document is:

    <HTML>
    <HEAD>
    <META HTTP-EQUIV="refresh" content="2;URL=http://www.acm.org/classics/">
    </HEAD>
    <BODY>
    You are following a link from gopher to a web site. You will be
    automatically taken to the web site shortly. If you do not get sent
    there, please click
    <A HREF="http://www.acm.org/classics/">here</A> to go to the web site.
    <P>
    The URL linked is:
    <P>
    <A HREF="http://www.acm.org/classics/">http://www.acm.org/classics/</A>
    <P>
    Thanks for using gopher!
    </BODY>
    </HTML>

    This document may be any desired by the server authors, but must
    adhere to these requirements:
    * It must provide a refresh of a duration of 10 seconds or less
    * It must not use IMG tags, frames, or have any reference whatsoever
      to content outside that particular file -- other than the link
      to the real destination.
    * It must not use JavaScript.
    * It must adhere to the W3C HTML 3.2 standard.

    When a non-conforming Gopher client finds a reference to a HTML file
    (type h), it will open up the file via Gopher (getting the redirect
    document) but using a web browser. The web browser will then be
    redirected to the actual link destination. Conforming clients will
    follow the link directly.

    END


# TLS support

From: 2020-06-07 Gopher TLS prototype in geomyidae by 20h at
<gophers://bitreich.org/0/usr/20h/phlog/2020-06-07T18-28-23-863932.md>:

    # 2020-06-07 18:28:23.863932 UTC (+0000)

    Gopher TLS prototype in geomyidae

    We are happy and proud to announce, that there is now a prototype of
    gopher tls in geomyidae

        git://bitreich.org/geomyidae

    How does it work?

    When a client tries to connect via TLS, the first byte of the packet
    will be 0x16 or 22 decimal, which is forbidden as a selector in Gopher.
    This gives the server a hint to start TLS. Old servers will simply
    reject such a connection attempt.

    For now clic supports TLS. We are working on hurl TLS support. And for
    sacc it is on its way.

        git://bitreich.org/clic
        git://bitreich.org/sacc
        git://codemadness.org/hurl

    Hopefully further support will come to other clients.

    If you do not have anything at hand, here are some commandline clients:

    Plain old Gopher:

        printf "/\r\n" | nc bitreich.org 70

    And with TLS:

        printf "/\r\n" | socat openssl-connect:bitreich.org:70,verify=0 -

    Have fun using TLS on gopher!


    All patches and recommendations are welcome.


    Sincerely yours,

    20h
    Senior Security Manager (SSM)


# Gopher TLS URI

A gopher TLS URI is the same as the Gopher URI described in RFC4266,
except the protocol scheme is gophers://.

When the client using the Gopher protocol does not support TLS it can
simply use a plain gopher:// connection.


# Gopher TLS downgrades

A client COULD implement the following logic:

When a user uses gophers:// then it should use TLS and not downgrade
automatically to a plain connection. The client COULD also show a
_clear_ message if the TLS connection is not accepted and offer a
manual downgrade option to plain-text.

When further selectors of the same host and port are accessed it should use
TLS automatically as well.


# Gopher+ compatibility

Gopher+ allows adding more TAB-separated fields to the output. For
Gopher, to be compatible with Gopher+ clients, it can simply accept the
line, but ignore these additional fields.
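
As an illustration of accepting a line while ignoring additional fields, the following Python sketch parses one directory entry and keeps only the first four TAB-separated fields. The names `DirEntry` and `parse_dir_entry` are hypothetical, and the error handling is an assumption made for the example rather than behaviour defined by the RFC.

```python
# Hypothetical directory-entry parser that tolerates Gopher+ extra fields.
from typing import NamedTuple


class DirEntry(NamedTuple):
    item_type: str
    display: str
    selector: str
    host: str
    port: int


def parse_dir_entry(line: str) -> DirEntry:
    """Parse 'Xdisplay TAB selector TAB host TAB port [TAB extra...] CR LF'."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 4 or not fields[0]:
        raise ValueError("not a valid directory entry")
    # The first character is the item type, the rest is the display string.
    item_type, display = fields[0][0], fields[0][1:]
    # Anything after the port (for example a Gopher+ '+' marker) is ignored.
    selector, host, port = fields[1], fields[2], fields[3]
    return DirEntry(item_type, display, selector, host, int(port))


if __name__ == "__main__":
    print(parse_dir_entry("1Docs\t/docs\texample.org\t70\t+\r\n"))
```

Splitting on TAB and indexing only the first four fields means plain Gopher lines and Gopher+ lines go through the same code path.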


# Other references:

* RFC1436 - The Internet Gopher Protocol
  <https://www.rfc-editor.org/rfc/rfc1436.txt>
  or see the file references/rfc1436.txt

* RFC4266 - The gopher URI Scheme
  <https://www.rfc-editor.org/rfc/rfc4266.txt>
  or see the file references/rfc4266.txt

* Gopher+:
  <https://github.com/gopher-protocol/gopher-plus/blob/main/gopherplus.txt>
  or references/gopherplus.txt

* geomyidae Gopher server:
  <git://bitreich.org/geomyidae>

* Helper tool to validate gopher and DirEntities:
  <git://bitreich.org/gopher-validator>

References in this repository: <gopher://bitreich.org/1/scm/gopher-protocol>